SXT (TM) SOFTWARE EXPLORATION TOOLS
CXT (TM) C EXPLORATION TOOLS
* CFT (TM) C FUNCTION TREE GENERATOR
* CST (TM) C STRUCTURE TREE GENERATOR
DXT (TM) DBASE EXPLORATION TOOLS
* DFT (TM) DBASE FUNCTION TREE GENERATOR
FXT (TM) FORTRAN EXPLORATION TOOLS
* FFT (TM) FORTRAN FUNCTION TREE GENERATOR
LXT (TM) LISP EXPLORATION TOOLS
* LFT (TM) LISP FUNCTION TREE GENERATOR
Version March 1994
Copyright (C) Juergen Mueller (J.M.) 1988-1994.
All rights reserved world-wide.
DISCLAIMER OF WARRANTY
THIS SOFTWARE AND ACCOMPANYING WRITTEN MATERIALS (INCLUDING
INSTRUCTIONS FOR USE) IS PROVIDED "AS IS" AND WITHOUT WARRANTY OF
ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT
LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE RESULTS AND
PERFORMANCE OF THE SOFTWARE IS WITH YOU.
IN NO EVENT WILL THE AUTHOR AND COPYRIGHT HOLDER BE LIABLE FOR
DAMAGES, INCLUDING ANY LOST PROFITS, LOST MONIES, OR OTHER
DIRECT, INDIRECT, GENERAL, SPECIAL, INCIDENTAL, EXEMPLARY OR
CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OR
INABILITY TO USE THIS PROGRAM (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES, BUSINESS
INTERRUPTION, LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR
LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE
PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS) AND ON ANY THEORY OF
LIABILITY, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES, OR
FOR ANY CLAIM BY ANY OTHER PARTY.
ACKNOWLEDGEMENT
BY USING THIS SOFTWARE YOU ACKNOWLEDGE THAT YOU HAVE READ THIS
LIMITED WARRANTY AND ACCOMPANYING REMARKS, UNDERSTAND IT, AND
AGREE TO BE BOUND BY ITS TERMS AND CONDITIONS. YOU ALSO AGREE
THAT THIS IS THE COMPLETE AND EXCLUSIVE STATEMENT OF AGREEMENT
BETWEEN THE PARTIES AND SUPERSEDE ALL PROPOSALS OR PRIOR
AGREEMENTS, ORAL OR WRITTEN, AND ANY OTHER COMMUNICATIONS BETWEEN
THE PARTIES RELATING TO THE SUBJECT MATTER OF THE LIMITED
WARRANTY.
You are expressly prohibited from selling this software or parts
of it in any form, circulating it in any incomplete or modified
form, distributing it with another product (except on CD-ROM) or
removing this notice. No one may modify or patch any of the
executable files in any way, including, but not limited to,
decompiling, disassembling or otherwise reverse engineering this
software in whole or part.
The documentation may be distributed verbatim, but changing it is
not allowed. The information and specifications in this document
are subject to change without notice.
THIS VERSION OF THE DOCUMENTATION, SOFTWARE AND COPYRIGHT
SUPERSEDES ALL PREVIOUS VERSIONS.
This software and documentation is Copyright (C) by
Juergen Mueller
Aldingerstrasse 22
D-70806 Kornwestheim
GERMANY
Email address: xmr@isw.uni-stuttgart.de
xmr@iswfs2.isw.uni-stuttgart.de
There is no relation between the author's professional work and
SXT development. SXT is an independent private project of the
author.
LICENSE
This version of the SXT Software Exploration Tools is NOT public
domain or free software, but is being distributed as SHAREWARE.
Non-registered users of this software are granted a limited
license for a 30-day evaluation period starting from the day of
the first use to make an evaluation copy for trial use for the
express purpose of determining whether this software is suitable
for their needs. At the end of this trial period you should
either register your copy or discontinue using this software. The
use of unregistered copies of this software, outside of the
initial 30-day trial, by any person, business, corporation,
government agency or any other entity is strictly prohibited.
This means that if you use this software, then you should pay for
your copy. This software is NOT free, but you have the
opportunity to try it before you buy it. Either pay for it, or
quit using it. A registration entitles you to use your copy of
this software on any and all computers available to you. If other
people have access to this software or may use it, then
additional copies or a site license should be purchased.
All users are granted a limited license to copy this software
only for the trial use of others and subject to the above
limitations. This license does NOT include distribution, selling
or copying of this software package in connection with any other
product or service or for distribution in any incomplete or
modified form. Operators of electronic bulletin board systems and
software servers (like INTERNET FTP-Servers) are encouraged to
post this software for downloading by their users, as long as the
above conditions are met.
This package is expected to be distributed by shareware and
freeware channels, but the fees paid for "distribution" costs
(e.g. disk, CD-ROM) are strictly exchanged between the
distributor and the recipient, and the author makes no express or
implied warranties about the quality or integrity of such
indirectly acquired copies. Distributors and users may obtain the
package directly from the author by following the ordering
procedures in the REGISTER files.
REGISTRATION REMINDER
Unregistered copies of this software are 100% fully functional. I
make them this way so that you can have a real look at them, and
then decide whether they fit your needs or not. This work depends
on your honesty. If you use it, I expect you to pay for it. When
you pay for the shareware you like, you are voting with your
pocketbook, and will encourage me and others to develop more of
these kinds of products.
THANK YOU FOR SUPPORTING THE SHAREWARE CONCEPT
TABLE OF CONTENTS
1 THE SXT SOFTWARE EXPLORATION TOOLS
2 GENERAL INTRODUCTION
3 PROGRAM DESCRIPTION
4 LANGUAGE IMPLEMENTATIONS
4.1 C-LANGUAGE IMPLEMENTATION AND C-PREPROCESSOR
4.2 C++ SOURCE CODE
4.3 DBASE SOURCE CODE
4.4 FORTRAN SOURCE CODE
4.5 LISP SOURCE CODE
4.6 ASSEMBLER SOURCE CODE
5 DATABASE GENERATION
6 PROGRAM LIMITATIONS
7 IMPROVING EXECUTION SPEED
8 COMMAND LINE SYNTAX DESCRIPTION
9 OUTPUT DESCRIPTION AND INTERPRETATION
10 INTEGRATION INTO PROGRAM DEVELOPMENT ENVIRONMENTS
11 TOOLS FOR DATABASE PROCESSING
12 TROUBLE SHOOTING
13 FREQUENTLY ASKED QUESTIONS
14 REFERENCES
15 TRADEMARKS
APPENDIX 1: C-PRECOMPILER DEFINES
APPENDIX 2: RESERVED C/C++ KEYWORDS
APPENDIX 3: EFFICIENCY
APPENDIX 4: SYSTEM REQUIREMENTS
APPENDIX 5: INSTALLATION
1 THE SXT SOFTWARE EXPLORATION TOOLS
The SXT Software Exploration Tools are a collection of software
analysis tools providing a similar functionality for different
programming languages. The following packages are currently
available:
* CXT - C Exploration Tools:
CFT - C Function Tree Generator
Tool to analyse and display the function call relationships
within the source code of C/C++ programs.
CST - C Structure Tree Generator
Tool to analyse and display the structure/class
relationships within the source code of C/C++ programs.
* DXT - DBASE Exploration Tools:
DFT - DBASE Function Tree Generator
Tool to analyse and display the function call relationships
within the source code of DBASE, CLIPPER, FOXBASE and other
XBASE-like programs.
* FXT - FORTRAN Exploration Tools:
FFT - FORTRAN Function Tree Generator
Tool to analyse and display the function call relationships
within the source code of FORTRAN programs.
* LXT - LISP Exploration Tools:
LFT - LISP Function Tree Generator
Tool to analyse and display the function call relationships
within the source code of LISP and SCHEME programs.
Each of these packages consists of the analysis program and a
recall program ("Navigator") to recall the analysis results which
can be stored in a database, plus documentation and additional
macros to integrate these tools into popular editors like BRIEF,
QEDIT or MicroEMACS.
Each of these packages is available for the following systems:
* DOS real mode (shareware release)
* DOS 386 protected mode (registered users only)
* WINDOWS NT text mode (registered users only)
* OS/2 text mode (registered users only)
* IMPORTANT * IMPORTANT * IMPORTANT * IMPORTANT * IMPORTANT *
Although this document is mainly based on the description of the
CXT programs CFT and CST (which were the only publicly available
SXT programs up to version 2.13) and is therefore very C/C++
oriented, the description applies in the same way to all other
SXT packages. The names CXT, CFT/CST and CFTN/CSTN can be
replaced by the corresponding names of the other products. Where
necessary, the specific differences of the SXT packages are
described. I have done it this way to ensure overall consistency,
to keep all related things together and to reduce the effort of
writing and maintaining this document.
2 GENERAL INTRODUCTION
The CXT programs are powerful program development, maintenance
and documentation tools. They provide the programmer the ability
to analyse the source code of applications, no matter how big or
complex they are. The CXT programs are also very useful to
explore unknown source code and to get a complete overview of
its internal structure. The re-engineering of old and/or
undocumented source code becomes an easy task with these
programs. The tools help the programmer to analyse, identify,
locate and access all parts of a large software system. They are
designed to support software reuse, maintenance and reliability.
By preprocessing, scanning and analysing the entire program
source code as a single unit, these programs build an internal
representation of the function call hierarchy (CFT) and of the
data structure relations (CST). The resulting output shows from a
global perspective the interdependencies and hierarchical
structure between the functions or data types of the whole, multi
file, software project. Several features and options allow the
user to customise the generated hierarchy tree chart output and
to get a large amount of useful information about the source code.
The hierarchy structure is always up-to-date because it relies on
the original source code as the primary source of information.
Written software documentation often differs from what has
actually been coded, so the source code itself is the ultimate
documentation.
An important feature is the database generation. It allows
information to be recalled without reprocessing the source code.
The database can again be read in by CFT and CST to produce
different outputs or to add new files to the database. Special
recall programs called CFTN and CSTN allow fast searching for
items in the database. These programs can be used within any
environment, for example on the DOS command line or from inside
editors like BRIEF, QEDIT or MicroEMACS (DOS and WINDOWS), to
provide a full software project management system with access to
all functions and data types with just a keystroke. These
features make a comfortable "hypertext source code browser and
locator" system out of your editor. A project consisting of
several files appears to the developer as if it were a
'whole-part' of software. The developer can walk through programs
and trace the logic without having to memorize the directories
and files where functions or data types are defined and called.
Displaying and printing a graphical representation of the
analysis results as a call graph is not supported by the SXT
programs, but owners of RATIONAL ROSE, a powerful software
development CASE tool supporting the Booch Object-Oriented
Analysis and Design (OOAD) method, can use that tool for this
purpose. The SXT programs can generate compatible output which
can be imported by Rational Rose. See option -RATIONAL for a
detailed description.
Listings of all functions/data types and source files can be
written as formatted ASCII text files and can be used as input
for other programs like word processors or spreadsheet
calculators.
A useful option of CST is the possibility to generate a source
file with which size and byte offset calculations for
structures/unions and their members can be performed. This option
is useful especially to support any kind of error searching or
hardware debugging, for example with an ICE, or if data
structures have to be exchanged between different hardware
platforms.
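The kind of figures involved can be illustrated with a short
sketch. The file that CST generates is C source; the Python
fragment below is not that file and only shows, under stated
assumptions, what "size and byte offset" mean for a structure.
The structure 'Packet', its members and the helper
'describe_layout' are hypothetical names chosen for this example.

```python
import ctypes


class Packet(ctypes.Structure):
    """Hypothetical structure, laid out with the platform's native C rules."""
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("value", ctypes.c_uint32),
        ("count", ctypes.c_uint16),
    ]


def describe_layout(struct_type):
    """Return ([(member, byte offset, byte size), ...], total size in bytes)."""
    rows = []
    for name, _member_type in struct_type._fields_:
        field = getattr(struct_type, name)   # descriptor carrying offset/size
        rows.append((name, field.offset, field.size))
    return rows, ctypes.sizeof(struct_type)


layout, total_size = describe_layout(Packet)
for name, offset, size in layout:
    print(f"{name:8s} offset={offset:2d} size={size}")
print("total size:", total_size)
```

Because of alignment padding the offsets are usually not simply
the running sum of the member sizes, and they may differ from one
platform or compiler to another, which is why such a listing
helps when structures are exchanged between platforms.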
CFT can also be used to analyse "C"-like languages as they are
used by several commercial programs. The macro programming
languages of the BRIEF, EPSILON and ME editors are such languages
and can be handled by CFT.
The resulting output files can be used for various purposes like
development or documentation. There are no restrictions on using
them for your own work.
CFT and CST have been used and tested since 1989 in several
projects with applications ranging from single source files over
medium sized projects (like CFT, CST and the other SXT tools
themselves) up to very large software projects with hundreds of
source and include files (mixed C and assembler code), more than
6 MB of source code, more than 200000 lines, 2000 functions and
500 data types.
Many publicly available C/C++ sources (e.g. GNU-C compiler,
GNU-C library, GNU-EMACS, MicroEMACS, NCSA TCP/IP communication
software package, SUIT - The Simple User Interface Toolkit, NIHCL
- The National Institutes of Health C++ class library, F2C
Fortran-to-C translator, several projects from Dr. Dobb's Journal
(DFLAT and BOB), Microsoft sample code (MFC 1.0 and 2.0)) were
processed (with sometimes surprising results!) during the
development and have been used to test and improve the features,
reliability, correctness, robustness and execution speed of CFT,
CST and their related utilities.
Although the other SXT packages are much newer than CFT and CST,
they all are closely related. The CXT tools were used as the base
for all other packages.
3 PROGRAM DESCRIPTION
CFT builds a hierarchy tree chart of every function with the
called functions in its own function block. These functions are
again used as a starting point for subsequent function blocks.
Starting the tree chart with the "main"-function it will display
the complete function flow chart and the function hierarchy
dependency of the whole application with all user defined
functions and the called library functions. Prototyped but never
defined or called functions are also detected. Recursive calls of
functions are recognised and displayed, even over several call
levels. Repeated calls of previously displayed functions in the
output tree chart are detected and a message will be given with a
reference to their first appearance. This prevents the output of
complete subtrees displayed earlier. Overloaded C++ functions and
operators are recognised and displayed with the number of
overloadings.
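The treatment of recursive calls and repeated subtrees described
above can be sketched in a few lines of Python. This is only an
illustration and not the way CFT is implemented; the function
name 'render_call_tree', the dictionary input and the wording of
the "(recursive)" and "(see line n)" marks are assumptions made
for this example.

```python
def render_call_tree(calls, root):
    """Return the indented lines of the call tree below `root`.

    `calls` maps a function name to the list of functions it calls.
    """
    lines = []
    first_seen = {}   # function name -> output line where its subtree was expanded

    def visit(name, depth, path):
        indent = "    " * depth
        if name in path:                    # calls back into an active caller
            lines.append(f"{indent}{name} (recursive)")
            return
        if name in first_seen:              # subtree was already displayed
            lines.append(f"{indent}{name} (see line {first_seen[name]})")
            return
        lines.append(f"{indent}{name}")
        callees = calls.get(name, [])
        if callees:
            first_seen[name] = len(lines)   # 1-based number of the line just written
            for callee in callees:
                visit(callee, depth + 1, path + [name])

    visit(root, 0, [])
    return lines


example = {
    "main": ["init", "run", "init"],
    "init": ["log"],
    "run": ["step", "log"],
    "step": ["run"],
}
tree = render_call_tree(example, "main")
print("\n".join(tree))
```

In this sketch 'run' is marked as recursive when it is reached
again through 'step', and the second call of 'init' only refers
back to the line where its subtree was first displayed.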
CST works similarly to CFT but operates on data types like basic
types, structures, unions, enumerations and C++ classes. CST
builds a hierarchy tree chart of every structure and union data
type with their internal elements and their related data types.
If these data types are again structures, unions or classes, the
substructures will again be displayed. CST recognises data types
defined by 'typedef' and derived from other data types. The type
names corresponding to the same basic type are displayed in the
output file as 'alias' names for their common basic data type
name. Every feature of CFT, like the detection of recursively
declared structures and unions, references to previously
displayed data types and others, is available and works similarly.
Every function (CFT) and data type (CST) can be displayed with
the name of the source file and the line number where it is
defined. The output can be customised to display the tree chart
as a call-tree ("CALLER-CALLEE"-relation: "WHO CALLS WHOM") or as
a caller-tree ("CALLEE-CALLER"-relation: "WHO IS CALLED BY
WHOM"). This feature allows the user to determine which functions
are called from a specific function or which functions are
callers of a specific function.
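The relation between the two views can be sketched as follows. The
caller-tree is obtained by reversing every "who calls whom" pair.
The function name 'invert_calls' and the dictionary
representation are assumptions for this illustration, not part of
CFT.

```python
from collections import defaultdict


def invert_calls(calls):
    """Turn a caller -> callees mapping into a callee -> callers mapping."""
    callers = defaultdict(list)
    for caller, callees in calls.items():
        for callee in callees:
            if caller not in callers[callee]:   # list each caller only once
                callers[callee].append(caller)
    return dict(callers)


calls = {"main": ["init", "run"], "run": ["log"], "init": ["log"]}
callers = invert_calls(calls)
print(callers)
```

Functions that are never called, like 'main' in this example, do
not appear as keys of the inverted mapping.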
The function and data type extraction from the source code is
done by scanning and parsing the source. There is absolutely no
need for the programmer to mark functions or data types of
interest, for example with special keywords, starting the
definitions at the beginning of a line or to use comments
containing special marks, as it is necessary for other source
code analysers and browsers. CFT and CST do not need these
work-arounds, any source code can be processed without previous
work. These tools are also compiler independent because they can
be customised to support any kind of compiler.
Several useful pieces of information and software metrics about
the processed source code and the included files can be
generated, such as
- file size and comment size in bytes for every file,
- number of source code lines for every file,
- number of included files for every source file,
- total effective number of scanned bytes and lines for every
source file and its included files (files that are included
multiple times influence these calculations),
- for every defined function the number of lines, the code and
comment size in bytes, the number of bytes per line, the
number of functions called, the number of flow control
statements (if, else, for, while, case, default, goto,
return, exit), the maximum brace nesting level and if the
function is used only inside the file,
- for every defined structure/union the total number of
elements and the number of elements which are themselves
structures/unions,
- file function or data type reference list for every file,
- total number of displayed, defined, undefined or multiple
defined functions and data types,
- location of all multiple defined functions and data types,
- location of all overloaded C++ functions,
- source file - include file dependencies for every source
file,
- final statistical summary for all files,
- cross reference of every occurrence for every function or
data type,
- parent/children relationship for every function and data
type,
- critical function call path/structure nesting with deepest
non-recursive nesting level (unlimited tree depth),
- C++ class inheritance graph,
and much more ...
The resulting hierarchy structure chart is another representation
for a directed call graph. A directed call graph consists of
nodes (functions or data types) and connections (call relations)
between these nodes. The number of nodes and connections which
are necessary to transform the hierarchy structure chart into a
directed call graph will also be calculated as an additional
indication of the system complexity.
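As a rough illustration of these two figures, the following
Python sketch counts the nodes and connections of a small call
relation. It assumes that a repeated call between the same two
functions counts as one connection; the manual does not state
how the programs count such cases, and the name 'graph_size' is
hypothetical.

```python
def graph_size(calls):
    """Return (number of nodes, number of connections) of a call relation."""
    nodes = set(calls)
    connections = set()
    for caller, callees in calls.items():
        for callee in callees:
            nodes.add(callee)                    # callees without a definition count too
            connections.add((caller, callee))    # duplicates collapse in the set
    return len(nodes), len(connections)


calls = {"main": ["init", "run", "init"], "run": ["log"], "init": ["log"]}
node_count, connection_count = graph_size(calls)
print(node_count, connection_count)
```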
A large number of options to control the program execution and
the output generation are available and can be defined on the
command line, by command files or by defining them in an
environment variable used by the program.
CFT and CST can be directly invoked from inside editors or
integrated development environments like the Borland C++ IDE.
Detailed examples for the integration together with necessary
macro or batch files are given.
4 LANGUAGE IMPLEMENTATIONS
4.1 C-LANGUAGE IMPLEMENTATION AND C-PREPROCESSOR
The ISO/ANSI C language standard ISO/IEC 9899:1990 (E) (or
X3.159-1989 ANSI C) as described in several books about the
C-language (see references) was used as a development base. The
reserved keywords being recognised are not only the original
ISO/ANSI C keywords but were also taken from several compiler
implementations like Microsoft, Borland or GNU and their own
special language extensions. The books "The C++ Programming
Language" and "The Annotated C++ Reference Manual" (ARM) together
with information about the work of the ANSI C++ committee X3J16
and the ISO/IEC working group SC22 WG21 were used for the C++
keywords. Another major source was the AT&T C++ release 2.1.
Compiler specific extensions especially from GNU are also
recognised. Proposed extensions to C++ like additional keywords
(e.g. wchar_t) and the so called 'digraphs' will be supported if
they are introduced into the C++ language standard.
A complete list of all reserved keywords is shown in appendix 2.
The large set of keywords may lead to some slight problems in
situations where a keyword is not used as itself but as an
identifier name, for example a C++ keyword used as an identifier
in C.
During a normal file scan, precompiler defines are, if possible,
handled as if a real precompiler were present, but this can
cause some trouble with '#if', '#ifdef' and other precompiler
controls which are not evaluated. Also the block nesting level,
which is monitored by the source code scanner, may not be at
level 0 at the end of the file because of such precompiler
controls. To avoid such problems, a built-in C-preprocessor allows
the complete preprocessing of the source code and include files
for several compiler types as an additional option (-P).
The choice whether to preprocess or not is somewhat
controversial: it can result in a loss of information if macros
are used to change the program behaviour and hide function calls,
it can lead to errors during file scanning, or it can change the
function and data type information obtained from the code, which
may then not correspond exactly to the visible source code.
Preprocessing can be an advantage or a disadvantage, so the user
has to decide whether to use it.
The preprocessor handles the defines for Microsoft C 5.1,
Microsoft C/C++ 7.0, Microsoft VC++ 1.0 for Windows NT (Beta
Release June 1993), Turbo C++ 1.0, Borland C++ 2.0, Borland C++
3.1, GNU-C and Intel 80960 C compiler iC960 3.0 and all memory
models (not necessary for GNU-C and I960) or CPU architectures
for the Intel 80960 32 bit RISC processor (KA, KB, SA, SB, MC,
CA). Other compiler types can be customised with the -B and the
-D options. The default ISO/ANSI C predefined macros '__FILE__',
'__LINE__', '__DATE__', '__TIME__' are generated for
preprocessing. The macro '__STDC__' is NOT defined (some
compilers test with '#ifndef __STDC__'), so that non standard
ISO/ANSI C extensions in the processed code are allowed. Defining
'-D__STDC__=1' forces ISO/ANSI C conforming output (if used by
the scanned source code, of course!). Additional supported
precompiler defines are '__TIMESTAMP__', '__BASE_FILE__' and
'__INCLUDE_LEVEL__'. A list of the predefined preprocessor
defines for the supported compiler types is shown in appendix 1.
Features like the replacing of trigraphs and the recognition of
C++ comments '//...' are also treated by the preprocessor.
The precompiler recognises several errors or possible sources for
problems like
- the use of undefined variables in precompiler controls,
- misbalanced '#if...' control block(s) including the exact
location (file, line) where the failing block started,
- include files that include themselves recursively,
- wrong number of macro arguments (missing ones or too many)
and displays diagnostic messages with an exact description of the
error or warning reason and its location in the source file.
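One of these checks, the detection of misbalanced '#if...' blocks
together with the location where the failing block started, can
be sketched with a simple stack. This is a minimal illustration
under stated assumptions and not the precompiler's actual code:
the function name, the message wording and the file name
'demo.c' are hypothetical, and directives inside comments or
strings are not treated specially.

```python
def find_unbalanced_conditionals(filename, text):
    """Report '#if...' blocks without '#endif' and stray '#endif' lines."""
    open_blocks = []    # stack of (line number, directive) for open blocks
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith("#"):
            continue
        words = stripped[1:].split()
        directive = words[0] if words else ""
        if directive in ("if", "ifdef", "ifndef"):
            open_blocks.append((number, directive))
        elif directive == "endif":
            if open_blocks:
                open_blocks.pop()
            else:
                problems.append(f"{filename}:{number}: #endif without matching #if")
    for number, directive in open_blocks:
        problems.append(f"{filename}:{number}: #{directive} block is never closed")
    return problems


source = """#ifdef DEBUG
#if LEVEL > 1
int verbose;
#endif
int x;
"""
report = find_unbalanced_conditionals("demo.c", source)
print(report)
```

Keeping the line number on the stack is what makes it possible to
name the start of the failing block instead of only the end of
the file.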
4.2 C++ SOURCE CODE
Although CFT and CST were initially not developed to process C++
code it is possible to do so. In that case, however, some
restrictions and limitations should be considered.
The recognition of C++ classes by CST is limited because the
handling of the internal class structure items (variables and
functions) is too complex to fit in the CST program. So classes
are only referenced by name but their internal structure will not
be scanned and displayed. The C++ class inheritance relationships
are recognised and shown in a class hierarchy graph listing
(option -b). Structures in C++ with function names as structure
members will not be processed correctly. Templates are not
supported and will not be recognised.
Calls of member functions will not be recognised correctly
because the class name is missing, which also leads to an
incomplete call tree. The use of overloaded functions with equal
names but different parameters in C++ programs may lead to
incorrect calling relationships. A variable initialization with
parameters will be misinterpreted as a function call. A correct
handling of these and other C++ features requires a complete C++
source code analyser to keep track of the class each function
belongs to and the different calling parameters.
If precise information about C++ code is needed, utilities like
'class hierarchy browsers' or 'class viewers', which are usually
(or should be) part of C++ compiler environments, should be used
instead.
Because of the above described reasons, some care should be taken
if C++ code is processed and displayed.
4.3 DBASE SOURCE CODE
DFT can process source code which is based on the DBASE III/IV
programming language. This means that source code written in
DBASE derivatives like CLIPPER or FOXBASE can also be analysed.
The source code analyser tries to be as correct as possible to
build a reliable hierarchy tree. A function/procedure declaration
is recognised by the FUNCTION or PROCEDURE keyword. A
function/procedure call is recognised by the following
statements:
function()
CALL function
CALL function WITH parameters
DO function
DO function WITH parameters
If a file contains no function/procedure declaration, the
filename itself is taken as procedure name. All tokens are
assumed case-insensitive and are converted to upper-case
characters.
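The call forms listed above can be matched with two simple
patterns, as the following Python sketch shows. It is not DFT's
parser: the regular expressions and the name 'find_calls' are
assumptions for this illustration, 'DO WHILE' and 'DO CASE' are
excluded because they are control statements rather than calls,
and comments and string contents are not filtered out.

```python
import re

# Hypothetical patterns for the call forms listed above.
DO_OR_CALL = re.compile(r"^\s*(?:DO|CALL)\s+([A-Z_][A-Z0-9_]*)")
NAME_PAREN = re.compile(r"\b([A-Z_][A-Z0-9_]*)\s*\(")
CONTROL_WORDS = {"WHILE", "CASE"}   # DO WHILE / DO CASE are not calls


def find_calls(line):
    """Return the upper-case names of functions/procedures called on one line."""
    upper = line.upper()             # tokens are treated case-insensitively
    names = []
    match = DO_OR_CALL.match(upper)
    if match and match.group(1) not in CONTROL_WORDS:
        names.append(match.group(1))
    for name in NAME_PAREN.findall(upper):
        if name not in names:
            names.append(name)
    return names


print(find_calls("do report with m_total"))
print(find_calls("total = calc_tax(price)"))
print(find_calls("DO WHILE .NOT. EOF()"))
```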
4.4 FORTRAN SOURCE CODE
FFT can process source code which is based on the FORTRAN 77
standard. Each FORTRAN line is divided into fields for the
required information; each column represents a single character.
COLUMN   FIELD
1        comment indicator (C,c,*,!)
1-5      label
6        indicator for line continuation
7-72     statement field (optionally up to column 132)
Continuation lines are merged before they are analysed. The
number of continuation lines is 19 by default and can be varied
between 0 and 99 (option -qn). The standard intrinsic functions
and additionally VAX-FORTRAN intrinsic functions are recognised.
All tokens are assumed case-insensitive and are converted to
upper-case characters. If option -I is set, INCLUDE statements
are recognised and handled. Two different types of include
statements are accepted:
C     TYPE 1: FORTRAN LIKE SYNTAX, INCLUDE STATEMENT STARTS IN
C     COLUMN 7, FILENAME IN SINGLE QUOTATION MARKS
      INCLUDE 'FILENAME'
C     TYPE 2: C LIKE SYNTAX, INCLUDE STATEMENT STARTS IN
C     COLUMN 1 WITH #, FILENAME IN DOUBLE QUOTATION MARKS
#INCLUDE "FILENAME"
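The column layout and the merging of continuation lines can be
illustrated with the following Python sketch. It is a simplified
model and not FFT's scanner: the function names are hypothetical,
the statement field is cut at column 72, the limit on the number
of continuation lines is not modelled, and merged parts are
joined with a single blank for readability, which is an
assumption of this example.

```python
def split_fixed_form(line):
    """Split one fixed-form line into fields; return None for a comment line."""
    line = line.rstrip("\n").ljust(72)
    if line[0] in "Cc*!":                      # column 1: comment indicator
        return None
    return {
        "label": line[0:5].strip(),            # columns 1-5
        "continuation": line[5] not in " 0",   # column 6
        "statement": line[6:72].rstrip(),      # columns 7-72
    }


def merge_continuations(lines):
    """Return complete statements with continuation lines merged in."""
    statements = []
    for raw in lines:
        if not raw.strip():
            continue                           # skip blank lines
        fields = split_fixed_form(raw)
        if fields is None:
            continue                           # skip comment lines
        text = fields["statement"].strip()
        if fields["continuation"] and statements:
            statements[-1] += " " + text
        else:
            statements.append(text)
    return statements


program = [
    "C     COMPUTE THE TOTAL",
    "      TOTAL = PRICE(I) +",
    "     &        TAX(I)",
    "   10 CALL REPORT(TOTAL)",
]
statements = merge_continuations(program)
print(statements)
```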
The resulting function call graph may be incorrect due to the
ENTRY capability of FORTRAN which allows direct jumps into a
function/subroutine body. This may result in incorrect
relationships for the ENTRY statement and the surrounding
function/subroutine.
4.5 LISP SOURCE CODE
LFT can process LISP and SCHEME source code. The development of
LFT was mainly based on the GNU-EMACS LISP dialect as it is used
in the GNU-EMACS macro extension language and its functionality
was tested mainly with these macro files. LISP functions/macros
are recognised by the DEFUN and DEFMACRO keywords. SCHEME
functions are recognised by the DEFINE keyword; SCHEME processing
is enabled by option -XSCHEME. Unnamed functions declared with
the LAMBDA keyword can be recognised optionally (option
-XLAMBDA). Tokens are assumed case-sensitive. Comments are
recognised from ';' until end-of-line and between '#|' and '|#' as
multi line comment blocks. The source code analysis is performed
in two passes. The first pass detects function/macro declarations
and the second pass analyses the relationships. Function calls
via (funcall <fcn>), (function <fcn>), (apply <fcn>), (mapc
<fcn>) and similar constructs may not be correctly evaluated if
fcn is a function-symbol (e.g. given as a function parameter) and
not a valid function name.
LFT was designed to work with different types of LISP source code
(such as XLISP, CLOS, GNU-EMACS LISP, ...), although the large
number of dialects may sometimes lead to unexpected problems.
4.6 ASSEMBLER SOURCE CODE
As an additional feature, CFT and FFT can process assembler
source code for the Intel 80x86 processors (MASM 5.1, TASM) and
for the Intel 80960 RISC processors (or any other "AT&T UNIX-like
assembler" like GNU) to get information about assembler
procedures and functions being called from the assembler source
files. The assembler source code scanner also detects and handles
calls of include files. This feature is useful for mixed language
programming. The processing of assembler macros, however, is not
supported; the preprocessing option (-P) works only with C source
code. Assembler source files are recognised by their file
extensions '.ASM' and '.S'; there is no other way to force a file
to be processed as an assembler file.
The following naming convention is used: For '.ASM' assembler
files (MASM, TASM) all identifiers are treated case-insensitive
and will be transformed to lower case characters, but identifiers
in '.S' (GNU, I960) assembler files are treated case-sensitive.
This means, that an assembler function 'func1' defined in an
'.ASM' file can be called from the source by 'func1', 'FUNC1',
'Func1' or any other lower and upper case character combination.
If 'func1' is defined in an '.S' file, the name must match
exactly. The first leading underscore of a function name will be
removed to get exact naming matches. Type modifiers in C source
code like 'cdecl' or 'pascal' will not be considered. Remember
these conventions when processing C/FORTRAN and assembler files.
Assembler code statements (inline code) inside C source code will
not be processed and will be skipped, because it is too difficult
to handle the several kinds of syntax being used for this like
'asm ...', 'asm "..."' or 'asm(...)' and the different keywords
('asm', '_asm', '__asm', '__asm__', ...) used by various compiler
implementations.
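The naming convention for assembler functions described above can
be summarised in a short Python sketch. It is an interpretation
and not the programs' actual matching code: the function name
'call_matches' is hypothetical, and it is assumed here that the
leading underscore is removed from the name as it is defined in
the assembler file.

```python
import os


def call_matches(called_name, defined_name, defining_file):
    """Check whether a call refers to a function defined in an assembler file."""
    extension = os.path.splitext(defining_file)[1].upper()
    if defined_name.startswith("_"):
        defined_name = defined_name[1:]      # only the first underscore is removed
    if extension == ".ASM":                  # MASM, TASM: case-insensitive
        return called_name.lower() == defined_name.lower()
    return called_name == defined_name       # '.S' files: case-sensitive


print(call_matches("FUNC1", "_func1", "io.asm"))
print(call_matches("FUNC1", "_func1", "io.s"))
```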
5 DATABASE GENERATION
One of the most important features provided by CFT and CST is the
database generation which can be enabled with the -G option. It
is performed after writing the output file to save all
information about the processed files in a set of dBASE
compatible database files (extension '.DBF') for later use. These
database files contain all necessary information like function
or data type names, the location where they are defined, their
caller/callee relationship, all scanned files with statistical
information, include files and so on. The information is stored
in the most compact and effective database structure possible to
save disk space. Note that if the contents of the database files
are manipulated by external tools like dBASE, the internal
consistency will be corrupted and wrong or unexpected results
will occur!
The database can be used to recall information, for example to
find out, if and in which file and on which line a specific
function or data type is defined. A previously generated database
can be read into CFT and CST (option -g) to add new files to it
and/or to produce another output file with new configuration
options, for example with the reverse call tree or only with a
special selected item of interest to be displayed. Such an
incremental database generation is also useful if large projects
can be divided into a set of commonly used files and project
specific files. A good example for this is the GNU C compiler,
which consists of a set of language independent files and three
language dependent file sets for C, C++ and Objective-C. To
analyse this software with CFT or CST, the language independent
part can be stored into a database which is later reused for the
language dependent parts to build the complete set of
information.
The ability to retrieve information about the sources from the
database is quite useful in many cases. Recalling information
from a database is much faster than processing all the sources
again to find a specific item of interest. The documentation and
maintenance of large software projects is much more effective and
easier to do if the developer has a tool for navigating through
the source code that helps in understanding the program and its
internal structure. It is also useful for reverse
engineering of source code to get an overview of the internal
program structure. Together with user programmable editors it is
possible to offer the user a source code browser with a hypertext
like feeling by integrating database recalling functions into the
editors.
Two utility programs, called CFTN and CSTN, are available to
retrieve information from databases, together with supporting
macros for their integration into the BRIEF, QEDIT or MicroEMACS
editors. They are described in a later section of this manual.
6 PROGRAM LIMITATIONS
First of all, CFT and CST cannot replace a compiler or a syntax
checker like 'LINT' to detect errors in the source code. This
means that it should be possible to compile the source code
without fatal errors before it is possible to analyse it with CFT
and CST, otherwise the processing results may be incorrect (and
the system may even crash ...).
However, there are some situations where CFT and CST can be
useful to detect bugs and inconsistencies in the source code like
- multiple definitions of functions or data types,
- different function return types,
- implicitly declared functions with no prototype,
- function definitions used as prototype,
- recursive, nested, hidden and frequent calls of include
files,
- unclosed strings or character constants,
- nested comments,
- misbalanced braces,
- unexpected end-of-file characters inside files,
- illegal characters in the source code,
- wrong number of macro arguments,
- missing macro arguments,
- misbalanced '#if...' control blocks.
These code checks are done on multiple files in multiple
directories so that inconsistencies between different files can
be found and displayed. This is a capability which conventional
compilers, working on only a single file at a time, cannot
provide, so they will miss such inconsistencies (maybe the linker
will find some of them).
Some statistical information about the source code may not be
correct if preprocessing is enabled (-P). This affects all
options which do statistics like the -p or -s option. The size of
the 'pure' source code may not be correct due to macro expansion
or removing of unnecessary blanks. However, the file size is
always correct because it will be taken from the source file.
Most of the program limitations are caused by the limited
available memory. This means that the more conventional main
memory you have, the better it is. The real mode versions of CFT
and CST do not use expanded or extended memory, no virtual memory
management or disk file swapping, so keep your conventional
memory free of memory consuming TSR programs and other utilities
if you want to process a large number of files. The use of
operating systems like MS-DOS 5.0 or DR-DOS 6.0 and memory
managers like QEMM or 386MAX to get more free conventional memory
may help to handle big applications with a large number of files.
If memory problems still occur during processing, there is an
easy way to break the memory limits: use the 32 bit protected
mode versions of CFT and CST, called CFT386 and CST386. These
programs are running in protected mode and so they have no memory
limitations and are faster than the real mode versions.
The number and size of files to be processed are nearly
unlimited, with a maximum of 2^14 (16384) files and a maximum
file length of 2^31 bytes. Each file can have 2^16 (65536) lines.
The number of functions and data types being handled is limited
to 2^14. Note that these values are given for the real mode
versions; the protected mode versions exceed them. These limits
should be sufficient even for the biggest projects imaginable.
The nesting of include files is limited by the number of files
which can be open simultaneously (dependent on the operating
system or compiler). The ISO/ANSI C minimum for include file
nesting levels is 8; this requirement is met by CFT and CST.
The integrated C-preprocessor limits the size of expanded macros
to 6 Kbytes. The number of macros simultaneously defined is
unlimited (ISO/ANSI: 1024) and only affected by the available
memory. The number of macro parameters is limited to 31
(ISO/ANSI: 31) and there are up to 31 significant characters
(ISO/ANSI: 31) recognised. The conditional compilation nesting
levels of '#if...' control blocks is limited to 32 (ISO/ANSI: 8).
The line length is unlimited (ISO/ANSI: logical line length is
509 characters). The number of characters in a string (including
'\0') is 2048 (ISO/ANSI: 509). The number of members in one
structure/union is unlimited (ISO/ANSI: 127), the number of
structure/union nesting levels is unlimited (ISO/ANSI: 15).
The recognition of identifiers like function and variable names
follows the standard rules: an identifier consists of upper and
lower case letters (A-Z, a-z), underscore (_) and digits (0-9),
additionally the dollar sign ($) will be accepted. National
character set extensions, as are common for languages in
European countries such as Germany, Denmark or Sweden, can be
defined with option -J.
C++ comments '//...' are usually only recognised if option -C++
is set. However, to accept the non-standard extension of some
compilers which allow such comments in C source code as well,
option -// can be used. Nested C style comments '/*...*/' are
not allowed and will always produce warnings.
The calculation depth of the critical function call path or
structure nesting level is unlimited. The calculation is a
heavily recursive function and was successfully tested up to
115 nesting levels. It is not known at which nesting level a
stack overflow will occur.
CFT cannot recognise and reference a function if it is used with
its pure name, without parentheses. This happens if a function
name is assigned to a function pointer variable or used as a
function pointer argument in a function call. Indirect calls to a
function via a function pointer cannot be resolved. CFT will be
confused in some rare cases by extensive type-casting operations
like 'void __based(void) * __cdecl ... ()' and will display
unexpected messages. A function prototype declaration inside a
function block ('function given scope') will not be recognised by
CFT. In assembler source code, some definitions of local
variables seem to look like a function or a label definition and
are treated by CFT like that although this may be wrong in some
cases. It is also not always possible to detect a call of a local
label correctly. CFT sometimes displays warning messages about
'return type mismatch' although the code may be correct in that
special case, because the different types were defined earlier by
a 'typedef' declaration. The reason is simply that CFT doesn't
recognise these 'typedef's (but CST does!), it looks only for
function names.
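The following C fragment is a minimal sketch of the function
pointer cases described above. The names 'twice', 'apply' and
'demo_indirect_call' are hypothetical and are not taken from CFT.
The function 'twice' is referenced only by its pure name, and the
call through 'fn' is indirect, so according to the limitation
above neither use would be recorded as a call of 'twice'.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinary function; below it is referenced by name only. */
static int twice(int x)
{
    return 2 * x;
}

/* Calls its argument indirectly through a function pointer. */
static int apply(int (*fn)(int), int x)
{
    assert(fn != NULL);
    return fn(x);   /* indirect call: the target name is not visible */
}

/* Passes 'twice' as a pointer value, without parentheses. */
int demo_indirect_call(void)
{
    int (*handler)(int) = twice;   /* assignment of the pure name */
    return apply(handler, 21);
}
```

In a tree chart for such code, 'apply' would be expected to appear
as a callee of 'demo_indirect_call', while the relation to 'twice'
has to be documented by hand.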
An often requested feature for CST is the integration of the
calculation of structure/union sizes with byte offset
information for every structure/union member. This feature is
not implemented in CST although it would be possible to do this
because all necessary information is present. The reason is
that there would be too much overhead for CST to treat the
various compiler implementations with their different basic type
sizes (sizeof(int), sizeof(long double)) for different processor
types (16 bit, 32 bit, 64 bit, ...) and data type alignment
requirements (by default and also controlled with #pragma's like
'align' or 'pack'). It would be possible to do this for just one
selected compiler implementation or processor type but not for a
great number of them. Especially compilers for advanced
architectures like RISC processors have very complicated type
alignment rules depending on the data types, alignment pragmas,
compiler switches, type sizes, available register number and
register sizes and resulting structure/union/class sizes to
generate highly optimised code. This includes usually the
insertion of 'fill' bytes inside a structure/union and sometimes
'padding bytes' at the end of a structure/union to force aligned
sizes on specific byte boundaries (for examples see the reference
manual of the Intel 80960 C-Compiler iC960, release 3.0). Because
of these reasons, an integrated 'byte offset calculation' is not
implemented in CST. Instead, you can generate a source file for
selected data types with option -O, that performs these
calculations, if you compile the generated file with your C
compiler. For further information see the description for option
-O.
SUMMARY
The limitations described above can in some situations lead to
misinterpretation or loss of information about the scanned source
code. The only way to avoid these shortcomings would be the
inclusion of parts of a 'real compiler' to handle the complete C
and C++ syntax in any possible situation. But this was not the
intention when the development of these programs as 'little' and
easy to use general purpose programming support tools began.
Nevertheless, I hope that CFT, CST and the other SXT programs
will in most cases be powerful and useful development and
documentation tools!
7 IMPROVING EXECUTION SPEED
CFT and CST are disk storage based programs because the source
and include files, the intermediate precompiler file and the
output file must be read from and written to hard disk. This
means that the execution speed of CFT and CST depends primarily on
the speed of the physical storage medium and not (only) on the
speed of the CPU. There are several ways to improve the program
performance:
- install a RAM-disk and
  a) start CFT and CST from there so that the intermediate
     file and the resulting output file will be stored there
     (but don't forget to copy the output file to the hard
     disk before power-off), or
  b) use the -v option to redirect only the precompiler
     output file (scanner input file) to the RAM-disk from
     anywhere the program is started (the RAM-disk must be
     large enough to hold the largest possible temporary
     file, otherwise a disk-write error will occur),
- use a hard disk cache program like SmartDrive, HyperDisk or
  PC-Cache,
- use a faster hard disk,
- and finally, of course, use a faster and more powerful CPU.
The most effective combination is option -v with a RAM-disk as
destination path and hard disk caching together with a fast hard
disk drive. If the disk cache is large enough to hold most of the
frequently called include files, the execution speed is about 2.5
to 3 times faster than without. This is a significant speed-up
especially for projects with a large number of files and many
included files in each source file.
During program execution with preprocessing (option -P), most of
the time is spent preprocessing the given input files and the
related include files and generating the preprocessor output
file. The scanning for functions (CFT) or data types (CST) takes
only a small amount of time. The function/data type relations are
computed while the output is generated and written to disk, there
is no precomputing necessary.
The function for critical call path/nesting level detection
depends only on the number of functions or structures and not on
the call/declaration nesting complexity. The execution time grows
linearly with the number of items (functions/structures) to
process and is very fast!
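One way to obtain such behaviour is to compute the depth of every
item only once and to reuse the stored result. The following C
fragment is a minimal sketch of that idea and not necessarily the
method used by CFT and CST. The type 'struct item', the limits
MAX_ITEMS and MAX_CALLS and both function names are hypothetical,
and the sketch assumes a call graph without recursion.

```c
#include <assert.h>

#define MAX_ITEMS 4   /* hypothetical table size for this sketch */
#define MAX_CALLS 2   /* hypothetical callee limit per item */

struct item {
    int ncalls;               /* number of direct callees */
    int callee[MAX_CALLS];    /* indices of the direct callees */
    int depth;                /* cached result, 0 = not yet computed */
};

/* Nesting depth of item 'idx': 1 for a leaf, otherwise 1 plus the
   deepest callee. Assumes that the call graph contains no cycles. */
int critical_depth(struct item *items, int idx)
{
    int k, best = 0;

    assert(idx >= 0 && idx < MAX_ITEMS);
    if (items[idx].depth != 0)
        return items[idx].depth;      /* already computed, reuse it */
    for (k = 0; k < items[idx].ncalls; k++) {
        int d = critical_depth(items, items[idx].callee[k]);
        if (d > best)
            best = d;
    }
    items[idx].depth = best + 1;
    return items[idx].depth;
}

/* Example graph: main -> {parse, emit}, parse -> emit. */
int demo_critical_depth(void)
{
    struct item g[MAX_ITEMS] = { { 0 } };

    g[0].ncalls = 2; g[0].callee[0] = 1; g[0].callee[1] = 2; /* main  */
    g[1].ncalls = 1; g[1].callee[0] = 2;                     /* parse */
    /* g[2] is emit, a leaf; g[3] is unused */
    return critical_depth(g, 0);
}
```

Because of the cached 'depth' value, 'emit' is evaluated only once
although it is reached on two different paths.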
Be aware of the fact that the processing of a large number of
files can take quite a long time (from several minutes up to
hours on lower performance machines!), especially if option -P
for preprocessing is enabled.
The generation of the output file and writing to disk can also
take some time if the number of items to display is large and the
nesting structure is complex or if there is no cross reference
option enabled (see -x and -r for further information). If the
number of items is very large, one of the most time consuming
options is the function/data type file reference (option -z). The
writing and reading of the database files (options -G and -g)
also takes some time due to the large amount of different
information.
Don't panic if there seems to be no disk access for a longer
time, the reason is just that there may be time consuming
computations and that the output will be buffered internally to
reduce the number of disk accesses and therefore speed up the
output!
For more detailed information about the program efficiency see
appendix 3.
8 COMMAND LINE SYNTAX DESCRIPTION
The SXT programs are command-line driven. This section gives a
complete overview of all command line options and their syntax.
It also gives remarks on their use and shows several examples
with detailed descriptions. The command line options are
case-sensitive! There are no differences between the real mode
and the other versions of the SXT programs. For every option the
SXT programs which support it are listed in parentheses. This
section of the documentation should be read very carefully by all
users to get a complete overview of all the features which are
provided.
THE OPTIONS ARE LISTED IN LEXICOGRAPHICAL ORDER.
NONE OF THE OPTIONS IS SET BY DEFAULT.
SYNTAX:  CFT [options [$cmdfile]] <[+]file> <@filelist>
         CST [options [$cmdfile]] <[+]file> <@filelist>
         DFT [options [$cmdfile]] <[+]file> <@filelist>
         FFT [options [$cmdfile]] <[+]file> <@filelist>
         LFT [options [$cmdfile]] <[+]file> <@filelist>
OPTIONS: (valid for)
-Bsizes (CFT, CST)
Redefine the basic type sizes and pointer type sizes (all values
must be declared in bytes) for conditional preprocessor controls
with the 'sizeof()' keyword like '#if sizeof(int) == 4'. This
option is only valid with the -P option.
The required format for this option is
     -Bv,c,s,i,l,f,d,ld*data,code
                       |
     (delimiter between data and pointer sizes is '*')
with the following types and their respective default data size
values in bytes (the pointer type sizes are model dependent):
     v    : void (sizeof(void) is usually 0, but for GNU-C it is 1)
     c    : char (1 byte)
     s    : short (by definition 2 bytes, hardware independent)
     i    : integer (hardware dependent, 2 or 4 bytes)
     l    : long (4 bytes)
     f    : float (4 bytes, IEEE format)
     d    : double (8 bytes, IEEE format)
     ld   : long double (10 bytes, IEEE format, some compilers
            assume long double == double (= 8 bytes), some CPU's
            and their compilers have special alignment requirements
            like the Intel 80960, where sizeof(long double) is 16
            bytes due to register and memory access requirements
            and structure alignment)
     data : data pointer (type pointers, 2 or 4 bytes, memory model
            dependent)
     code : code pointer (function pointers, 2 or 4 bytes, memory
            model dependent)
The sizes of signed and unsigned types of the same basic types
are considered equal, this means that, for example, the following
expression is true:
sizeof(unsigned int) == sizeof(signed int) == sizeof(int)
The sizes of type pointers to data and function pointers to code
are also considered equal, this means that, for example, the
following expressions are true:
sizeof(int *) == sizeof(float *)
sizeof(int (*)()) == sizeof(float (*)())
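Note that in C source code a chained comparison 'a == b == c'
compares the result of 'a == b' with 'c', so the equalities above
are meant pairwise. The following C fragment is a small sketch
that checks these assumptions for the compiler in use. The
function name 'size_assumptions_hold' is hypothetical. Only the
equal size of signed and unsigned types is guaranteed by the C
standard; the pointer equalities are assumptions that hold on
common platforms.

```c
#include <assert.h>

/* Returns 1 if the size assumptions hold for the compiler in use. */
int size_assumptions_hold(void)
{
    int basic = sizeof(unsigned int) == sizeof(int)
             && sizeof(signed int) == sizeof(int);
    int data  = sizeof(int *) == sizeof(float *);
    int code  = sizeof(int (*)(void)) == sizeof(float (*)(void));

    assert(basic);   /* guaranteed by the C standard */
    return basic && data && code;
}
```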
A 64 bit (8 bytes) integer type like 'long long' or 'bigint' (or
something else) is not supported because there are no C compilers
known to me which use such a type although some (co-)processors
and their assemblers are able to handle it (see Intel 80960
assembler manual for examples).
If the -B option is not set, the default values for the various
memory models and compiler types (as they are known to me) are
used, the assumed target hardware has an Intel 80x86
microprocessor. Note that during preprocessing type modifiers
like "near" or "far" are not recognised.
If the -B and the -T options are not set, the sizes of data
pointers and code pointers are always considered equal:
sizeof(int *) == sizeof(int (*)()) (= 4, large model)
For example, -B0,1,2,2,4,4,8,10*4,4 would be the correct
declaration for MS-C 7.0, large/huge memory model, with the
values for data types (void = 0, char = 1, short = 2, int = 2,
long = 4, float = 4, double = 8 and long double = 10 bytes) and
pointers to data types and function pointers (all values 4
bytes). These values are set automatically by defining -TMSC70,L
(or -TMSC70,H) as compiler type and memory model description for
preprocessing.
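To make the order of the ten values in the -B argument explicit,
the following C fragment splits such an argument (without the
leading '-B') into named fields. It is only an illustration of
the format and says nothing about how CFT and CST parse it. The
type 'struct type_sizes' and the functions 'parse_sizes' and
'demo_parse_sizes' are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Sizes in bytes, in the order used by the -B argument. */
struct type_sizes {
    int v, c, s, i, l, f, d, ld;   /* basic types */
    int data, code;                /* pointer sizes after '*' */
};

/* Parses "v,c,s,i,l,f,d,ld*data,code"; returns 1 on success. */
int parse_sizes(const char *arg, struct type_sizes *out)
{
    int n;

    assert(arg != NULL && out != NULL);
    n = sscanf(arg, "%d,%d,%d,%d,%d,%d,%d,%d*%d,%d",
               &out->v, &out->c, &out->s, &out->i, &out->l,
               &out->f, &out->d, &out->ld, &out->data, &out->code);
    return n == 10;
}

/* Checks the MS-C 7.0 large model values given in the text. */
int demo_parse_sizes(void)
{
    struct type_sizes ts;

    if (!parse_sizes("0,1,2,2,4,4,8,10*4,4", &ts))
        return 0;
    return ts.i == 2 && ts.ld == 10 && ts.data == 4 && ts.code == 4;
}
```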
-C++ (CFT, CST)
Enable C++ source code processing. This includes the handling of
C++ comments '//...', the recognition of C++ keywords and the
definition of the macro name '__cplusplus' for preprocessing. If
a supported compiler defines additional macro names like
'__TCPLUSPLUS__' for Turbo-C they will also be defined before
preprocessing. Option -C++ is strictly required to process C++
code correctly.
-C[s] (CFT, CST, DFT, FFT, LFT)
List the function/data type contents for every processed file,
's' sorts by line numbers (DEFAULT ORDER: lexicographical).
Additional information is available with option -s. CFT reports
if none of the functions defined in a file is called from
functions defined in other files (internal versus external
linkage). Functions for which no external caller outside the file
is found are marked [INTERNAL]; such functions are candidates
for being defined as 'static'. Attention: calling a function via
a function pointer won't be noticed! This information is useful
to find out whether the contents of a file are unnecessary for
the project, so that the file need not be linked. This option
also gives useful information about source code metrics for every
defined function.
-D[..] (CFT, CST, DFT, FFT, LFT)
Specifies macro name(s) (-Dname or -Dname1=name2) or file with
macro names (-D@namelist) of functions/data types which should be
predefined and linked together, also used as preprocessor define
if the integrated preprocessor is called (-P). The defined names
are case sensitive and trigraph translation is performed on them.
The definition of a string as replacement for a macro name is
different on the command line and inside a macro definition file
or command file (marked with '$'). On the command line, the
double quotation marks must be 'escaped' and the string must be
quoted like '-DXYZ="\"123\""' (similar to C strings) to work
correctly, the reason is the DOS wildcard expansion of the
command line. Inside a macro definition or command file, the
double quotation marks need not be 'escaped', so the definition
can be written like '-DXYZ="123"'. This option cannot be used in
environment defines if the equal sign '=' is used because this
produces a syntax error for DOS when trying to store a 'SET=...'
command with a second equal sign in one line. If a define item
consists of two words see the notes at option -S for a
description. Keep these differences and exceptions in mind to
avoid unexpected results using the -D option.
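Both notations for the string define are meant to give the
preprocessor the equivalent of '#define XYZ "123"'. The following
C fragment is a minimal sketch of source code that depends on
such a define. The fallback definition and the function name
'xyz_length' are assumptions added for illustration.

```c
#include <assert.h>
#include <string.h>

/* Fallback so that the sketch also compiles without -DXYZ=... */
#ifndef XYZ
#define XYZ "123"
#endif

/* Length of the replacement string of XYZ (3 for "123"). */
size_t xyz_length(void)
{
    assert(sizeof(XYZ) >= 1);   /* a string literal includes its '\0' */
    return strlen(XYZ);
}
```

If the quotation marks are lost on the command line, XYZ expands
to the number 123 instead of a string and a fragment like this
one no longer compiles.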
-Ename (CFT, CST, FFT)
Almost the same as -I, but the path for the include files will be
taken from the environment variable 'name'. Typing -EINCLUDE
would produce the same results as -I alone.
-E[..] (LFT)
Specifies name(s) (-Ename) or file with names (-E@namelist) of
external (builtin) functions. Useful if GNU-Emacs Lisp source
code is scanned to reduce the number of undefined functions
listed in the output file. A list of GNU-EMACS (version 18.59)
builtin functions is given with the file GNULISP.FCT.
-F (CFT, CST, DFT, FFT, LFT)
Use only ASCII characters for the tree chart output instead of
the DEFAULT semigraphic characters. This option is useful if the
generated output file should be printed on a printer which does
not support semigraphic characters like they are defined in the
IBM character set. It can also be used to prepare the output file
for use in a WINDOWS application like MicroEMACS if there is no
font with semigraphics available.
-G[name] (CFT, CST, DFT, FFT, LFT)
Generate a database with the complete set of information about
the processed sources. The additional parameter 'name' (path and
file name) is used as a unique base name for the set of database
files (up to 6 significant characters), the DEFAULT NAME 'CXT' is
used if no name is specified. If 'name' ends with a (back-)slash,
it is used as a pathname. The generated database files (extension
'.DBF') are dBASE compatible. There are two additional files
created, one with the command line options (extension '.CMD') and
one with a list of the source files (extension '.SRC') being used
for database generation. They can be used as command line
definition files with '$' (command list) and '@' (file list).
As a result of the database generation you will find files named
'CXTxy.ext' (default name 'CXT') or 'namexy.ext' (user
defined 'name'), where 'x' will be 'F' for CFT or 'S' for CST and
'y' is replaced by an internally used character to mark the
different database files and their contents.
-H[elp] (CFT, CST, DFT, FFT, LFT)
See option -?.
-I[path] (CFT, CST, FFT)
This option enables the scanning of include files declared with
'#include "..."' or '#include <...>' or with a similar syntax for
FORTRAN. The required path for the include files is taken from
the INCLUDE environment variable (DEFAULT BEHAVIOUR) or can be
user defined by 'path'. Paths defined with -I will be searched
before any other paths taken from environment variables specified
by -E or -P, so care should be taken with that option. Include
paths can be given either absolute or relative. A relative path
is always considered relative to the directory of the source file
it is used with, not to the directory the analysis is started
from or where the analysis program is located. Specifying -I*
ignores missing include files during preprocessing (-P). This is
a 'quick and dirty' approach, but can sometimes be useful if
include interrelations or locations are unknown. However, the
results may not always be correct.
Using the -I or -E option without -P allows the scanning of the
source file and the included files without preprocessing. In that
case an include file is handled as if it were a completely new
file; this can lead to errors if a file inclusion is specified
within a function or structure. Also, preprocessor controls like
'#if ...' are not evaluated, which can lead to unexpected results.
-Jcharset (CFT, CST, DFT, FFT, LFT)
Extend the C/C++ character set (a-z, A-Z, 0-9, _, $), which is
used by DEFAULT, for identifier recognition with a user defined
character set 'charset'. This option allows the programmer to use
national character sets as they are common in Germany, Denmark,
Sweden and other European countries. All characters must be
specified within one -J option.
-L[L][+] (CFT, CST, DFT, FFT, LFT)
Redirect the screen output to a file, called 'CFT.LOG' or
'CST.LOG'. If '+' is set, the output is both written to screen
and redirected to the log file so that the output messages can
both be viewed as they appear and later analysed. Finally, -LL
or -LL+ appends the output to an existing file; this can be
useful if CFT and CST run in batch jobs.
-M (CFT, CST, FFT)
This option generates a source file/include file dependency table
for every processed file. This table shows the dependent include
files of a source file and can be used for a MAKE file. It is
also useful to check if the included files are taken from the
correct directories. If a file is included more than once, the
number of inclusions will be displayed.
-N (CFT, CST, DFT, FFT, LFT)
Disable the writing of an output file. This option can be useful
if, for example, only a database (option -G) should be generated
with CFT or CST and no output file is required. In that case the
sometimes very time consuming process of output file writing is
skipped. Note that for CST the writing of the byte offset file
"CST_OFFS.C" will not be affected by this option.
-O[..] (CST)
Specifies name(s) (-Oname) or file with names (-O@namelist) of
data types for which the calculation of structure/union sizes
with byte offset information for every data type member should
be performed. Additionally specifying -O+ sets a flag for the
recursive collection of sub-structures during expansion which are
displayed without specifying them by -O. This means that if a
structure/union consists of members which are also structures or
unions (and so on), it is not necessary to specify all these data
type names with -O to enable them for byte offset calculation.
Instead, you have to specify only the top most data type with
-Oname and additionally -O+ to force CST to select all related
sub-types for displaying. If -O+ is set but NO names are
specified, ALL structures and unions will be used for byte offset
calculations!
As the result of this option, CST generates a C source file,
called 'CST_OFFS.C'. This file needs some additional editing to
declare necessary include files, data types, defines or pragmas
before it can be compiled with the C compiler for which the file
was generated (be sure to use the same includes!). The resulting
executable prints for every structure/union member the byte
offset relative to the beginning of the structure/union (decimal
and hexadecimal) and the size of each member, the resulting
structure/union size and also information on whether a
structure/union member has been aligned (= compiler dependent
insertion of fill bytes before that member) or if the
structure/union was padded with fill bytes at the end of it to
align the size to a specific length.
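The kind of calculation involved can be sketched in standard C
with 'offsetof' and 'sizeof'. The following fragment is NOT the
file generated by CST; it only illustrates how offsets, fill
bytes and padding can be derived with the target compiler. The
type 'struct sample' and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical data type used only for this sketch. */
struct sample {
    char tag;
    long value;
    char flag;
};

/* Fill bytes the compiler inserted before member 'value'. */
size_t fill_before_value(void)
{
    assert(offsetof(struct sample, tag) == 0);  /* first member */
    return offsetof(struct sample, value)
         - (offsetof(struct sample, tag) + sizeof(char));
}

/* Padding bytes the compiler appended after member 'flag'. */
size_t padding_at_end(void)
{
    return sizeof(struct sample)
         - (offsetof(struct sample, flag) + sizeof(char));
}

/* Prints offset (decimal and hexadecimal) and size of each member. */
void print_sample_layout(void)
{
    printf("tag  : offset %lu (0x%lX), size %lu\n",
           (unsigned long)offsetof(struct sample, tag),
           (unsigned long)offsetof(struct sample, tag),
           (unsigned long)sizeof(char));
    printf("value: offset %lu (0x%lX), size %lu, fill before: %lu\n",
           (unsigned long)offsetof(struct sample, value),
           (unsigned long)offsetof(struct sample, value),
           (unsigned long)sizeof(long),
           (unsigned long)fill_before_value());
    printf("flag : offset %lu (0x%lX), size %lu\n",
           (unsigned long)offsetof(struct sample, flag),
           (unsigned long)offsetof(struct sample, flag),
           (unsigned long)sizeof(char));
    printf("total size %lu, padding at end: %lu\n",
           (unsigned long)sizeof(struct sample),
           (unsigned long)padding_at_end());
}
```

The printed values depend on the compiler, its switches and the
target processor, which is exactly why the calculation is left to
the compiler instead of being done inside CST.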
To obtain this information and to perform the necessary
calculations, the source file 'CST_OFFS.C' can become very large
and makes use of the C macro programming capabilities, which may
in some rare cases lead to errors during compilation due to the
internal limitations of some C compilers.
The -O option is very useful if you need detailed information
about structures/unions when searching for errors and debugging,
especially for hardware debugging with an ICE. It is also useful
for finding out the differences in the internal layout of a
structure/union in the case of porting C source code between
different compilers and/or operating systems or if data
structures are exchanged between different hardware platforms,
for example with data communication. You can verify if the
expected structure/union layout and size is really produced by
the target compiler.
-P[name] (CFT, CST)
Run the integrated C preprocessor before the file scan. In this
case the include path is taken from the INCLUDE environment
variable (DEFAULT BEHAVIOUR), from the user defined 'name'
environment and additional paths from -I and -E option are used.
If special paths should be searched before the default paths,
they must be specified by the -I path or the -E environment
option and they must be placed on the command line before the -P
option to be processed first. The -D, -U preprocessor defines and
-T type and memory model and -B size information are also used,
if defined. The path for the preprocessor output file can be
specified by the -v option, otherwise the current working
directory will be used (DEFAULT BEHAVIOUR). The comments in the
source and included files will remain unless -q is defined to
remove them. The comments are used for statistics with option -p.
If option -C++ is set, the macro '__cplusplus' will be predefined
before preprocessing to enable C++ macros and C++ comment
recognition.
If you are using a compiler which is not supported by CFT and CST
or the built-in preprocessing doesn't satisfy your needs because
the results seem to be different from your preprocessor, you can
preprocess the files you want to analyse with your own compiler
preprocessor and use these preprocessed files as input for CFT
and CST.
-R (CFT, CST, DFT, FFT, LFT)
By default, CFT and CST generate the hierarchy tree chart of the
called function/data type ("CALLER:CALLEE relation", "WHO CALLS
WHOM"). The -R option produces an inverted listing showing the
callers/users of each function/data type. It generates the output
as the function/data type hierarchy member list tree chart in
reverse order as a list of calling items of the referenced basic
item ("CALLEE:CALLER relation", "WHO IS CALLED BY WHOM"). This
option is useful to get the relations between functions/data
types and their callers/users.
-RATIONAL (CFT, CST, DFT, FFT, LFT)
This option generates a so called 'Petal' file for Rational Rose
2.0 for MS-Windows 3.1, a CASE-tool supporting the Booch
Object-Oriented Analysis and Design (OOAD) method. The generated
output file can be imported by Rational Rose to use the builtin
capabilities for describing and visualizing Finite State Machines
(FSM), but in this case (mis-)used to graphically visualize the
calling relationships of functions or data types. If you have
Rational Rose 2.0, you have to perform the following steps to get
impressive results: Start Rational Rose and select a new model
('File' - 'New') and import the generated file ('File' -
'Import...'). If successful, a class diagram with one class
symbol named 'CallGraph' appears. Click on that symbol and choose
'Browse' - 'State Diagram'. In the state diagram select 'Tools' -
'Layout' to start the layout optimization function. As the result
the graphical call tree of the source code analysis is displayed
with each function/data type shown as a circle ('state') and the
call relationship shown as an arrow ('transition') from the
calling to the called item, for classes from the superclass to
the subclass. You can zoom into the diagram, print the results or
incorporate the diagrams into your program documentation via
Clipboard, e.g. into MS-Word-for-Windows.
This option is available for all SXT programs. The generated
files are named 'CFT.PTL', 'CST.PTL', 'DFT.PTL' and so on. CST
generates an additional file named 'CSTCLASS.PTL' describing the
class inheritance relationships. The -RATIONAL option is a
work-around for the missing graphical layout capabilities of the
SXT programs (which some users have requested in the past) by
using an external program to provide the missing features. This
option was tested with Rational Rose 2.0 Beta for MS-Windows 3.1.
Note that Rational Rose needs even for small and medium sized
projects some time to import the file and process the FSM layout.
-S[..] (CFT, CST, DFT, FFT, LFT)
Specify name (-Sname) or file with names (-S@namelist) of
functions/data types to search for and to dump if present, names
are case sensitive. These items are listed first in the output
tree chart file. When using -S on the command line, it is necessary
to surround a data type name that consists of two words with
double quotation marks like "struct _iobuf" to connect the two
words. This is not necessary inside a list file, but there every
search name must be on a separate line.
-Tn (FFT)
Set the tabulator expansion size to 'n' (DEFAULT: 8 characters).
-Ttype,m (CFT, CST)
Use this option to set the compiler type for source code
preprocessing to one of the following types:
     MSC51    Microsoft C 5.1
     MSC70    Microsoft C/C++ 7.0
     MSVCWNT  Microsoft VC++ 1.0 for Windows NT
     TC10     Borland Turbo C++ 1.0
     BC20     Borland C++ 2.0
     BC31     Borland C++ 3.1
     BC10OS2  Borland C++ 1.0 for OS/2
     GNU      GNU-C
     I960     Intel 80960 iC960 3.0
The supported memory models are T(iny) (valid only for MSC70,
TC10, BC20, BC31), S(mall), M(edium), C(ompact), L(arge), H(uge),
'L' is assumed as default if no model is specified. MS VC++ for
Windows NT, Borland C++ for OS/2, GNU-C and Intel iC960 do not
need a memory model because they compile true 32 bit code. The
Intel iC960 compiler requires the definition of the 80960 RISC
processor architecture which is one of KA, KB, SA, SB, MC, CA
(default is KB).
This option causes several compiler dependent preprocessor macros
(as far as they are known to me) to be defined before
preprocessing starts. This option can only be used with the -P
option, otherwise it has no effect.
If your compiler is not supported, you can perform the following
steps: Find out which preprocessor defines are necessary (manual,
help file) and declare them with option -D, then declare,
depending on the selected memory model or processor architecture,
the type sizes with option -B.
-U[..] (CFT, CST)
Specifies a predefined macro name (-Uname) or file with
predefined macro names (-U@namelist) to be undefined for
preprocessing. Note that the default predefined macro names
'__FILE__', '__LINE__', '__DATE__', '__TIME__' cannot be
undefined. All other predefined names for the various compiler
types can be undefined. Like for -D, the names are considered
case-sensitive, but trigraph translation is not performed because
the internal representation cannot contain trigraphs.
-V (CFT)
List prototyped functions which are neither called nor defined
(option -a and -u). This option is useful to find unused function
prototypes which could be removed from the source code.
-Wlevel (CFT, CST, DFT, FFT, LFT)
Set error and warning message level. Higher warning levels
include lower ones. The DEFAULT level is always the highest
supported warning level. Possible levels are:
0 : all error and warning messages are suppressed except
absolutely catastrophic fatal errors,
1 : display serious errors or warnings,
2 : includes level 1 plus additional errors and warnings,
3 : includes level 2 plus errors/warnings/remarks,
4 : includes level 3 plus warnings about implicitly declared
functions and missing type or storage class.
The following levels affect only preprocessing (CFT and CST):
5 : includes level 4 plus warnings and errors during
preprocessing (non-fatal errors and warnings during
preprocessing are otherwise not displayed, preprocessor is
running in 'silent mode'),
6 : includes level 5 plus remarks/slight warnings during
preprocessing.
The output format for messages during file scan is
file name(line): error: description
file name(line): warning: description
and during preprocessing (warning levels 5 and 6)
preprocessor: file name(line): error: description
source line
preprocessor: file name(line): warning: description
source line
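Because the message formats above are regular, the messages can be picked up by other tools. The following Python sketch shows one possible way to split such a line into its parts. The function name 'parse_message' and the sample message texts are hypothetical; only the layout 'file name(line): error/warning: description' and the 'preprocessor:' prefix come from this manual.

```python
import re

# Layout taken from the manual:
#   [preprocessor: ]file name(line): error|warning: description
MESSAGE = re.compile(
    r"^(?P<pp>preprocessor: )?(?P<file>.+?)\((?P<line>\d+)\): "
    r"(?P<kind>error|warning): (?P<text>.*)$"
)


def parse_message(line):
    """Return the parts of one message line, or None for any other line."""
    match = MESSAGE.match(line.rstrip("\r\n"))
    if match is None:
        return None                      # e.g. the echoed source line
    return {
        "preprocessor": match.group("pp") is not None,
        "file": match.group("file"),
        "line": int(match.group("line")),
        "kind": match.group("kind"),
        "text": match.group("text"),
    }
```

Lines that do not match, such as the source line echoed after a preprocessor message, are returned as None so that a caller can skip them.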
-X (CFT, CST, DFT, FFT, LFT)
Assume a UNIX-style text file: no CR, only LF. The DEFAULT
ASSUMPTION is a DOS-style text file with CR+LF. Any other
combination like CR in UNIX-files, CR without following LF or LF
without preceding CR in DOS-files will cause a warning message.
This option is useful to detect possible conversion errors
between different operating systems or incorrect editor
configuration settings.
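The check described for -X can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not the code used by the SXT programs. The function name 'check_line_endings', the wording of the problem texts and the choice not to count a lone CR as a line end are assumptions.

```python
def check_line_endings(data, unix_style=False):
    """Return (line number, problem) pairs for unexpected CR/LF bytes.

    data is the raw file content as bytes. With unix_style=False a
    DOS-style file (CR+LF) is expected, otherwise a UNIX-style file
    (LF only).
    """
    problems = []
    line = 1
    i = 0
    while i < len(data):
        ch = data[i:i + 1]
        if ch == b"\r":
            if unix_style:
                problems.append((line, "CR in UNIX-style file"))
            elif data[i + 1:i + 2] == b"\n":
                i += 1                   # consume the LF of a proper CR+LF
                line += 1
            else:
                problems.append((line, "CR without following LF"))
        elif ch == b"\n":
            if not unix_style:
                problems.append((line, "LF without preceding CR"))
            line += 1
        i += 1
    return problems
```

An empty result means that the file matches the expected style throughout.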
-XLAMBDA (LFT)
Recognize the LISP resp. SCHEME keyword 'lambda' for unnamed
function declarations. By DEFAULT, 'lambda' is treated as a
simple identifier.
-XSCHEME (LFT)
Assume SCHEME source code instead of LISP source code (DEFAULT).
This means that functions are recognised by the 'define' SCHEME
keyword instead of the 'defun' resp. 'defmacro' LISP keywords.
-Y (CFT, CST, DFT, FFT, LFT)
Ignore CR+LF checks. This option disables all checks which are
done for unexpected CR+LF combinations in DOS or UNIX files. If
option -Y is set, option -X will be ignored. This option can be
useful if there are too many messages of this kind or if they
are of no interest to the user.
-Z[s] (CFT, CST, DFT, FFT, LFT)
Display every caller and member for each function/data type, 's'
sorts by the number of calls (DEFAULT ORDER: lexicographical),
this is an extension of the -c option. This option shows the
relations in the following form:
List of parent functions/data types:
  1. caller (reference #) <# of calls from>
  ...
  n. caller ...
function/data type (reference #) <# of calls from parents, # of
calls to children>
List of child functions/data types:
  1. called member (reference #) <# of calls to>
  ...
  m. called member ...
This compact form lists all callers and members with the number
of their calls, recursions are detected and displayed.
-a (CFT, CST, DFT, FFT, LFT)
List every function/data type, also previously referenced
functions/data types. This generates a complete list of every
function/data type in lexicographical order with references to
their first location.
-b (CST)
Display the C++ class inheritance relationships. This option
generates two listings. The first one displays the complete C++
class hierarchy graph(s). The second one shows for each class
first the superclasses from which the class inherits and the
access restrictions (public, protected, virtual, ...) and second
the subclasses which inherit from the given class, also with
access restrictions. This option is useful to find out things
like the class dependencies or multiple inheritance.
-cmdline (CFT, CST, DFT, FFT, LFT)
Print the command line options at the beginning of the output
file as a remark for the generation rules of that output file.
The contents of commandlist and filelist files are indented after
the listfile name.
-c[s] (CFT, CST, DFT, FFT, LFT)
Display the number of calls to each function/data type, 's' sorts
by the number of calls (DEFAULT ORDER: lexicographical). Useful
to find out which functions/data types are never called/used
(maybe unnecessary and deletable) and which ones are the most
frequently called/used (together with profiler results a subject
for further optimization efforts).
-dn (CFT, CST, DFT, FFT, LFT)
Set the maximum function/structure/union nesting level for output
generation to 'n' (DEFAULT: maximum value n = 999). This means
that the request for displaying a deeper level will be rejected
and the output tree chart will be truncated at the given level.
-e[char] (CFT, CST, DFT, FFT, LFT)
Generate formatted ASCII text files with function/data type list
and file list. All entries are separated by the optional 'char'
character; if 'char' is not defined, the tabulator character is
used as DEFAULT SEPARATOR. If spaces are wanted as separating
characters, you have to write -e" ". Such prepared files can be
used directly as input to other programs like word processors
(e.g. MS-WORD for WINDOWS) or spreadsheet calculators (e.g.
MS-EXCEL), for example for documentation purposes. The following
files are created:
CFTITEMS.TXT:
Contents: function name, return type, file name, line #, total #
of function bytes, # of function comment bytes, # of function
lines, # of control statements, # of brace levels
CSTITEMS.TXT:
Contents: data type name, file name, line #
CFTFILES.TXT and CSTFILES.TXT:
Contents: file name, # of lines, file size in bytes, # of comment
bytes, # of functions/data types
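As a sketch of how such a file might be read by another program, the following Python fragment loads the records of CFTITEMS.TXT into dictionaries. It assumes one record per line, no header line and the field order given above. The function name 'read_cft_items', the dictionary keys and the sample record are hypothetical.

```python
import csv
import io

# Field order as listed for CFTITEMS.TXT; the key names are illustrative.
CFT_ITEM_FIELDS = [
    "name", "return_type", "file", "line", "bytes",
    "comment_bytes", "lines", "control_statements", "brace_levels",
]


def read_cft_items(stream, separator="\t"):
    """Read records from an open text stream, one dictionary per function."""
    rows = []
    reader = csv.reader(stream, delimiter=separator, quoting=csv.QUOTE_NONE)
    for record in reader:
        if len(record) != len(CFT_ITEM_FIELDS):
            continue                     # skip lines that do not fit the layout
        row = dict(zip(CFT_ITEM_FIELDS, record))
        for key in CFT_ITEM_FIELDS[3:]:  # all fields from 'line' on are counts
            row[key] = int(row[key])
        rows.append(row)
    return rows


# Hypothetical sample record, separated by the default tabulator character.
sample = io.StringIO("test\tint\tTEST.C\t100\t250\t40\t12\t3\t2\n")
items = read_cft_items(sample)
```

If the file was written with another separator, for example with -e" ", the same character has to be passed as 'separator'.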
-f (CFT, CST, DFT, FFT, LFT)
Generate an output list in short form, only with the
function/data type names, no further description of the internal
function/data type elements.
-g[name] (CFT, CST, DFT, FFT, LFT)
Read a previously generated database (see option -G). The
additional parameter 'name' (path and file name) is used as a
unique base name for the set of database files (up to 6
significant characters), the DEFAULT NAME 'CXT' is used if no
name is specified. If 'name' ends with a (back-)slash, it is used
as a pathname. Every source file will be tested for changes of
file creation time and file size and a warning message will be
given to inform the user.
-h[elp] (CFT, CST, DFT, FFT, LFT)
See option -?.
-iname (CFT, DFT, FFT, LFT)
Ignore function member 'name' in output tree chart. It will not
be displayed and will be skipped instead if found as a function
member. This option can be useful if, for example, functions are
used only for test purposes and are of no further interest for
the user and should be ignored in the output tree chart.
-l (CFT, DFT, FFT, LFT)
List a function only once in case of repeated consecutive calls
(DEFAULT: display every occurrence). If a function is called more
than one time inside a function without any other call in
between, there will be only one reference of that function call
in the output tree chart. This option results in shorter output
files.
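The effect of -l on the list of calls inside one function can be pictured with the following Python sketch. The function name 'collapse_consecutive' and the sample call names are illustrative only.

```python
from itertools import groupby


def collapse_consecutive(calls):
    """Keep one entry for each run of identical consecutive calls."""
    return [name for name, _run in groupby(calls)]


# Calls in the order in which they appear inside one function body.
calls = ["init", "read", "read", "read", "init", "close"]
shortened = collapse_consecutive(calls)
```

Here the three consecutive calls of 'read' are reduced to one entry, while the second call of 'init' is kept because another call lies in between.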
-mtype (CST)
Start the data type tree chart with data type 'type' (-mtype).
If -m+ is specified, the output starts with the topmost data
type, this is the data type which is in the highest level of the
hierarchy tree chart. The default output is in lexicographical
order of the displayed data types. Useful if a selected
structure/union should be displayed at the beginning of the
output file.
-m[name] (CFT)
-mname (DFT, FFT, LFT)
Start the function tree chart dump with function 'main' (-m) or
'name' (-mname), name is case sensitive. If -m+ is specified, the
output starts with the topmost function, this is the function
which is in the highest level of the hierarchy tree chart. If
this option is not set, the default is lexicographical order of
the displayed functions.
Usually, the complete function tree chart should start with the
'main' function so that every subfunction is a (sub-)member of
'main'. This option is useful for windows programs to start the
output with the initial 'WinMain' function (-mWinMain) instead of
'main'. It can also be used to start the output with the initial
assembler start-up code being executed before the 'main'-function
is called.
-n[a] (CFT, CST, DFT, FFT, LFT)
Display the most critical function call path respectively display
the data structure/union with the maximum nesting level. The
modifier 'a' is used to display every function/structure with
its users/callers (DEFAULT: display only deepest call path). This
option helps to determine the complexity of the function
call/data structure hierarchy and finds recursions over several
call/nesting levels. Note that for functions the maximum call
path being displayed is the result of the static source code
analysis. During program execution the call path can be even
deeper if functions are called indirectly with function pointers.
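The idea of a deepest call path in a static call graph can be sketched in Python as follows. This is a plain depth-first search written for illustration, not the algorithm used by CFT. The function name 'deepest_call_path' and the sample graph are assumptions, and the search is only suitable for small graphs.

```python
def deepest_call_path(graph, root):
    """Return (deepest call path from root, set of recursive calls found).

    graph maps a function name to the list of functions it calls.
    """
    recursions = set()

    def visit(func, path):
        best = path
        for callee in graph.get(func, []):
            if callee in path:
                recursions.add((func, callee))   # call back into active path
                continue
            candidate = visit(callee, path + [callee])
            if len(candidate) > len(best):
                best = candidate
        return best

    return visit(root, [root]), recursions


# Hypothetical call graph: 'step' calls 'run' again (recursion over 2 levels).
graph = {
    "main": ["init", "run"],
    "init": ["log"],
    "run": ["step", "log"],
    "step": ["log", "run"],
}
path, recursive_calls = deepest_call_path(graph, "main")
```

As stated above, such a result reflects only the calls visible in the source code; calls through function pointers are not part of the graph.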
-ofile (CFT, CST, DFT, FFT, LFT)
Write the generated analysis results to file 'file'. DEFAULT
BEHAVIOUR: The file names are 'CFT.LST' for CFT/CFT386 and
'CST.LST' for CST/CST386. Possible overwriting of an existing
output file with the same name other than the default one will be
detected and prompted for user reconfirming. The resulting output
file is an ASCII text file with no formatting characters which
can be printed with every printer, viewed and/or edited with
every text editor and taken as input to word processors, for
example for documentation purposes.
-p (CFT, CST, DFT, FFT, LFT)
Calculate the program code/file size ratio for every file and
make a final summary. This option gives a short overview about
the 'real' file contents versus complexity. The computed value is
in the range from 0.000 (only comment, no code) to 1.000 (only
code, no comment). Used together with -P, the results may not be
absolutely correct because of the macro expanding and removing of
parts of the source code by '#if...' control blocks. If
preprocessing -P is enabled, comment byte count in included files
will not be performed. If option -q is set, -p will not calculate
values related with comments.
-q (CFT, CST)
Remove comments from preprocessed files (DEFAULT: comments are
not removed). This option is only valid with option -P. It also
affects the -p option, because comments that have been removed
cannot be counted and no calculations can be done on them.
-qn (FFT)
Set the number of continuation lines to 'n' (DEFAULT: 19 lines).
The number must be in the range from 0 to 99.
-r (CFT, CST, DFT, FFT, LFT)
This is almost the same as option -x, but an additional file
reference with the file name and the line number of the
declaration will be given (includes -x).
The -r or the -x option is STRICTLY RECOMMENDED and should be
used as a default option, because without it, every function/data
type will be completely redisplayed, including the underlying
subtree of functions or data types, whenever it occurs in the
output tree chart and so the resulting output file will grow
immense, up to several megabytes, if there is enough disk space
to write the output file.
-s (CFT, CST, DFT, FFT, LFT)
Used with -C, this option gives additional information. For CFT
for every function: the number of lines for the function body,
the maximum brace levels, the number of bytes for the function
body and the number of comment bytes inside the function body.
The average values for every source file are computed and
displayed. For CST for every data type: number of type elements,
number of subelements (nested structures/unions).
-time (CFT, CST, DFT, FFT, LFT)
Print runtime information about the times consumed for source
analysis, preprocessing, output dump, database reading and
writing and for other miscellaneous jobs plus the total time. The
results are given in the format MINUTE:SECOND.MILLISECOND.
-u (CFT, DFT, FFT, LFT)
List undefined functions. These functions are probably library
functions, defined in other files which have not been scanned or
are unresolved externals found by the linker.
-vpath (CFT, CST, FFT)
Set a specific path for the intermediate precompiler output file.
This option is useful to speed up execution when the
intermediate file can be stored on a RAM-disk so that file access
to the precompiled file is much faster than on a hard disk.
Environment variables like 'TMP' or 'TEMP' to set the path for
temporary files are not evaluated.
-x (CFT, CST, DFT, FFT, LFT)
Cross reference in case of multiple use. Every function and data
type will be given a unique reference number which will
furthermore be used as an identifying reference number for the
function or data type if it is again displayed. See also option
-r for further descriptions.
-y (CFT, CST, DFT, FFT, LFT)
Display cross link list of files which contain referencing and
referenced functions/data types of functions/data types of a
specific file. This option shows the relations in the following
form:
1. referencing file
...
n. referencing file
file
1. referenced file
...
m. referenced file
This option is useful if you want to find out the file
relationships. This information can be used to isolate specific
files from a project, e.g. library files. It is also useful if
you want to separate a function and want to know which other
files are needed because they contain called functions.
-z (CFT, CST, DFT, FFT, LFT)
Generate a function/data type call cross reference table. For
every function/data type the location of its definition (file,
line) and a complete list of its calls/references, sorted by
files and line numbers is given in the following form:
1. function/data type (reference #) [file #], line #
[file #]: line #, ...
...
2. ...
...
The functions/data types are displayed in lexicographical order.
At the end of the section is the cross reference file list.
-// (CFT, CST)
Accept C++ comments '//...' in C source code. This option can be
used to ensure compatibility with C compilers which can also
recognize C++ comments within C source code (like Microsoft and
Borland).
-? (CFT, CST, DFT, FFT, LFT)
Shows the command line syntax and gives a short, but complete
help information about the accepted commands and their syntax.
COMMAND LINE FILES
cmdfile (CFT, CST, DFT, FFT, LFT)
Specifies a file with (additional) command line options. This
might be useful if the command line would be too long because of
the number of options and files declared or if you are usually
using the same options which can then be stored in a command
file. The initial '$'-character is required to mark a command
file.
filelist (CFT, CST, DFT, FFT, LFT)
A file with a list of source file(s) to be processed, wildcards
are accepted. The list file should have every file on a single
line. The rules for files containing assembler code and path
translation are described above. The initial '@'-character is
required to mark a filelist file. The '+' sign for subdirectory
processing is also possible inside the filelist file.
[+]file (CFT, CST, DFT, FFT, LFT)
The name of a source file to be processed. More than one file can
be specified on the command line. The default assumption for the
given files is that they contain C source code. Assembler source
files are only recognised by the file extension '.ASM' (80x86
MASM/TASM) and '.S' (Intel 80960, GNU).
The '+' sign indicates that, starting from the given directory,
all subdirectories should be searched recursively for the given
file name search pattern. This addition is useful if a large
software project is divided into several modules with separate
subdirectories for each module. In that case only the starting
(root-)directory with the requested file name search pattern must
be specified to search the current directory and all
subdirectories.
If the file name or the include file specification inside a file
contains a relative path ('./', '.\', '../' or '..\') it will be
translated into an absolute path starting from the current
working directory respectively in case of include files depending
on the path of the parent file. Command line wildcards '*' and
'?' are possible and will be accepted.
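The meaning of the '+' sign can be illustrated with a short Python sketch that collects the files for one file specification. It is not the search routine of the SXT programs. The function name 'find_sources' is hypothetical, and the sketch uses the path conventions of the system on which it runs.

```python
import fnmatch
import os


def find_sources(spec):
    """Return the files matching spec; a leading '+' searches subdirectories."""
    recursive = spec.startswith("+")
    if recursive:
        spec = spec[1:]
    directory, pattern = os.path.split(spec)
    directory = directory or "."         # no directory given: current directory
    matches = []
    for current, subdirs, files in os.walk(directory):
        subdirs.sort()                   # visit subdirectories in a fixed order
        for name in sorted(files):
            # compare in lower case: path names are not case sensitive
            if fnmatch.fnmatch(name.lower(), pattern.lower()):
                matches.append(os.path.join(current, name))
        if not recursive:
            break                        # without '+': starting directory only
    return matches
```

With this sketch, 'find_sources("+*.h")' returns the header files of the current directory and of all directories below it, whereas 'find_sources("*.h")' stays in the current directory.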
REMARKS ON USING OPTIONS
NONE OF THE ABOVE DESCRIBED OPTIONS IS PREDEFINED SO IT'S UP TO
THE USER HIMSELF TO CUSTOMIZE HIS PREFERRED PROCESSING BEHAVIOUR
AND OUTPUT STYLE BY ADDING CONTROL OPTIONS NEEDED THEREFORE.
This assumption seems to be the best way to give the users the
freedom of making their own decisions about the features they
really need for doing their work.
However, some of the above described options should be regarded
and used as 'DEFAULT' options to generate a readable, complete
and useful output file without unexpected side effects. So the
minimum default command lines look like
CFT -m -ra <files>
CST -ra <files>
Both command sets generate a complete listing containing all
items with file name and line reference and a cross reference id
for repeated use (options -ra). The option -m for CFT forces the
output to start with the 'main' function (if found). The
precompile option -P is not strictly necessary though for exact
results it should also be set together with the -T option. The
standard default command line might be
CFT -m -rauspMP -T<type> -cs -Cs -na -Zs -G <file[s]>
CST -rapMP -T<type> -cs -Cs -na -Zs -G <file[s]>
If you start using CFT and CST for your own work, take these
options as a basic set and try other options to get a feeling for
what they are useful for and how they affect the output.
The large number of options may be confusing for beginners but
this is the only way to give the users the flexibility of
customising their own output. Therefore, take some time to learn
about CFT and CST and their features, read this manual carefully
and gain your own experience with this software.
It is possible to declare more than one source file, command file
and list file on the command line. In that case they will be
processed in the order they appear. Files and options can be
placed in mixed order on the command line, there is no
recommended order for them because all options (also those inside
command files!) will be processed before any source files are
scanned.
The maximum command line length for DOS is 127 characters, so
this is a system dependent 'natural' limit for the options and
file names being declared. If you have more items to declare,
place them into command list files and file list files, which do
not have such limitations.
Options can also be defined by the environment variables CFT and
CST (also used for CFT386 and CST386) like
SET CFT=...
SET CST=...
To separate single options in the environment string, spaces are
required. See also the description for the -D option for remarks
on environment variable definitions.
The rules for the interpretation of options are:
1. if defined, all options in the environment variables CFT
(for CFT and CFT386) or CST (for CST and CST386) will be
taken,
2. the command line options and the option files will be
interpreted in the order they appear.
If an option is declared more than once with different values,
earlier declarations are overridden by the latest one.
If options are represented by a single character with no
additional optional values possible like -r or -a, they can be
grouped together with a single leading '-' in front like '-rasM',
which is the same as '-r -a -s -M'. The last option, however, can
have additions, for example '-rasMmWinMain', which is evaluated
as '-r -a -s -M -mWinMain'. If an option can have an additional
parameter, the parameter must follow the option character
directly, without a space in between. Inserting a space means
that no additional parameter is given for this option.
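The grouping rule can be sketched in Python as follows. The sketch is illustrative only: the function name 'expand_group' is hypothetical, and the set of parameterless options contains just the letters that appear grouped in the examples of this manual, not the complete list.

```python
# Illustrative subset of single-character options without parameters.
SIMPLE_FLAGS = set("rasMup")


def expand_group(arg, simple_flags=SIMPLE_FLAGS):
    """Expand a grouped option such as '-rasMmWinMain' into single options."""
    if not arg.startswith("-") or len(arg) < 2:
        return [arg]                     # not an option, e.g. a file name
    body = arg[1:]
    options = []
    for pos, ch in enumerate(body):
        if ch in simple_flags:
            options.append("-" + ch)
        else:
            # the last option of a group keeps everything that follows it
            options.append("-" + body[pos:])
            break
    return options
```

Applied to '-rasMmWinMain' the sketch returns '-r', '-a', '-s', '-M' and '-mWinMain', which matches the evaluation described above.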
File names being composed of drive letter, directory name, file
name and file extension, referred to below simply as 'path
name', are treated by some special procedures to force a unique
style of their internal representation:
- path names are always considered not case sensitive, so
there is no difference in upper case, lower case and mixed
case path names (the reason is that DOS does not make any
difference),
- path names containing './', '.\', '../' and '..\' (so called
'relative paths') are expanded and transformed into absolute
paths,
- the recommended directory delimiter is '/' (UNIX-style), if
a '\' (DOS-style) is recognised in a path name, it will be
replaced by '/',
- path names are always expanded and transformed into the
default style
<DRIVE LETTER>:<DIRECTORY PATH>/<FILE NAME>
to get a unique representation for every file name that must
be handled during processing,
- file names have a DOS like maximum length of 12 characters:
'<8 characters name>.<3 characters extension>', this is also
true for the Windows NT and OS/2 versions of the SXT
programs.
These actions are done with every path name during file
processing. File names given on the command line are also
transformed.
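The path name rules above can be approximated with a few lines of Python. The sketch is not the implementation used by the SXT programs. The function name 'normalize_path' and the default working directory 'C:\WORK' are hypothetical, and the use of upper case as the single internal spelling is an assumption (the manual only states that case is not significant). The limit of 12 characters for file names is not checked.

```python
import ntpath                            # DOS/Windows path rules on any system


def normalize_path(name, cwd="C:\\WORK"):
    """Return name in the style <DRIVE LETTER>:<DIRECTORY PATH>/<FILE NAME>."""
    # make the path absolute and resolve './' and '../' components
    full = ntpath.normpath(ntpath.join(cwd, name))
    # use '/' as directory delimiter and one spelling for all cases
    return full.replace("\\", "/").upper()
```

With a working directory of 'C:\WORK\PROJ', the relative name '..\src\test.c' becomes 'C:/WORK/SRC/TEST.C'.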
If you want to perform database generation (option -G) for
different projects, you are responsible for keeping them separate
and avoiding overwriting of existing databases. This can be done either
by giving the databases different names so that the database
files can be placed all in the same directory, or every database
must be written into its own directory. If you want to access the
databases be sure to use the correct name and/or path, also
within the BRIEF or MicroEMACS editors.
COMMAND LINE EXAMPLES
1. CFT -m -rau *.c
This program invocation of CFT processes all files with the
extension ".c" in the current directory and generates an output
file starting with the "main"-function (option -m) for the output
tree. Every function will be displayed with file and line number
reference and a cross reference number (option -r). All functions
will be shown in lexicographical order (-a), also undefined ones
(-u).
2. CFT -mWinMain -rausMP -TMSC70,L -Id: -cs -Cs -na -ve: -C++
*.c ..\*.c *.cpp
This invocation is similar to the one described above with some
extensions. The source files from the current (*.c, *.cpp) and
from the parent (..\*.c) directory will be preprocessed (-P) with
MS-C 7.0 defines for large memory model (-TMSC70,L). The include
file path will be taken from the environment variable "INCLUDE"
(default for -P) and the path "d:" (-Id:) will also be searched.
The precompiler output is stored in path "e:" (-ve:). C++
extensions and keywords will be recognised if they occur (-C++).
The output will start with the "WinMain"-function (-mWinMain).
There will be a sorted call statistic (-cs) and a function
summary for every scanned file (-Cs) with additional information
for every function (-s). The critical function call path for all
functions will be calculated and displayed (-na) and the included
files of every source file will be shown (-M).
3. CST -S"struct _test" -r *.h -W2 -C++
Start CST to scan all files in the current directory with
extension ".h" for data types. They will be displayed with file
name and line number reference and cross reference number (-r).
The output should be done for the data type 'struct _test'
(-S"struct _test"). The warning level is set to "2" (-W2).
4. CFT y.c -R -Dmain=main_entry z.c -P x.c
Start CFT to produce a reverse calling tree (-R) of the functions
found in the files "x.c", "y.c" and "z.c" in the current
directory. The files will be preprocessed (-P) before file scan,
the name "main" will be replaced by "main_entry" during
preprocessing (-Dmain=main_entry).
5. CST $cst1.cmd $cst2.cmd -ve:\tmp @cstfiles +*.h -olist.v1a
This invocation of CST receives its options from the command
files "cst1.cmd" and "cst2.cmd" and stores the preprocessor
output in path "e:\tmp" (-ve:\tmp). The files being processed are
defined in the source list file "cstfiles" and on the command
line by "+*.h". The "+*.h" file specification searches the
current directory and all subdirectories for files with the
extension ".h". The output file will be named "list.v1a"
(-olist.v1a).
6. CFT -ra -PGNUINC -TGNU -M c:\gnu\src\*.c c:\gnu\src\*.s -d10
CFT scans all files with extension ".c" and ".s" in the directory
"c:\gnu\src". They will be preprocessed with an include file path
defined in environment variable "GNUINC" (-PGNUINC) for compiler
type "GNU" (-TGNU). The output contains all functions (-a) with
complete reference information (-r) and a list of all included
files for every source file (-M). The output tree will be
truncated if the nesting level is higher than 10 (-d10).
7. CST *.c
CST processes all files with extension ".c" in the current
working directory. There are no options specified, so only the
options set by the environment variable 'CST', if present, will
be used to customise the program execution. As an example the
command line options used in example 6. can be defined as
environment variable CST by 'SET CST=-raMKPGNUINC -TGNU -d10'.
8. CFT -ra -PI960INC -TI960,KB *.c *.s
CFT scans all files with extension ".c" and ".s" in the current
directory. They will be preprocessed with an include file path
defined in environment variable "I960INC" (-PI960INC) for
compiler type "I960", 'KB' architecture (-TI960,KB). The output
contains all functions (-a) with complete reference information
(-r).
9. CFT -rRM -gproj40 -Gproj41
CFT reads the database named 'proj40' (-g) and produces as output
the reverse function call tree (-R) with complete reference
information (-r), the (include) file interdependencies (-M) and a
new database named 'proj41' (-Gproj41).
10. CST -g -Gnew -N
CST reads the default database (-g) and produces as output
another database named 'new' (-Gnew). No other output file is
generated (-N).
11. CST -N -OTEST -O+ test.h
CST reads the file "test.h", generates no output file (-N), but a
byte offset calculation file for data type 'TEST' (-OTEST) and
its enclosed type members (-O+).
9 OUTPUT DESCRIPTION AND INTERPRETATION
This section gives an overview about the files being generated by
CFT and CST and the interpretation of the results. Different
files are produced as output depending on the options being set
by the user. Usually, if -N is not set, all information is
written to the default output file CFT.LST or CST.LST or to the
file specified by the -o option. The internal structure of these
files and their meanings are described below. If database
generation is enabled with option -G, several files are produced.
They all have a common database name to identify the files that
are related with a project. The file extension '.DBF' marks the
dBASE compatible database files, the file with the extension
'.CMD' contains the command line options and the file with the
extension '.SRC' contains all source files that were processed.
For further information refer to the corresponding section in
the syntax description.
CFT OUTPUT
The output file is divided into several sections. Some of the
sections listed are generated by default (-), others are optional
(o) and only displayed if they are enabled by a command line
option. Also, the default sections can be customised to produce
the desired output. The sections generated for CFT are (in the
order they appear):
- file header
- function calltree/called-by hierarchy listing (-r, -R, -x,
-a, -m, -f, -dn, -V, -l)
- function summary
- multiple defined functions and their location (only if
detected)
- overloaded functions and their location (only if detected)
o undefined functions (-u)
o function call statistics (-c[s])
o function caller/member relations (-Z[s])
o function call cross reference table (-z)
o critical function call path (-n[a])
o source file - include file dependency (-M)
o function tables for source files (-C[s], -s, -q)
- file information summary (-p, -q)
Each function is displayed like:
int test() (1) <DMPCA> <TEST.C, 100>
with the following meanings
- int : function return type
- test() : function name
- (1) : function reference number
- <DMPCA> : found as (one or more of)
D = definition,
M = macro,
P = prototype,
C = function call,
A = assembler function
- <TEST.C, 100>: file name, line number
The line number is the line where the function definition block
starts with its initial '{' and not the line where the function
name resides. I think that this is the best solution because it
is the point where we go really inside the function block. This
convention is also used by source level debuggers which point to
the line with the opening brace on function entry.
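A function entry of the form shown above can be split into its parts by a small Python sketch like the following. The function name 'parse_function_entry' and the dictionary keys are hypothetical, and the sketch assumes that the entry contains the reference number and the file reference, as in the example line.

```python
import re

# Meanings of the letters inside the <DMPCA> field, as listed above.
FOUND_AS = {
    "D": "definition", "M": "macro", "P": "prototype",
    "C": "function call", "A": "assembler function",
}

ENTRY = re.compile(
    r"^(?P<rtype>.*?)\s*(?P<name>\w+)\(\)\s+"
    r"\((?P<ref>\d+)\)\s+<(?P<flags>[DMPCA]+)>\s+"
    r"<(?P<file>[^,>]+),\s*(?P<line>\d+)>$"
)


def parse_function_entry(text):
    """Split one function entry into its parts, or return None."""
    match = ENTRY.match(text.strip())
    if match is None:
        return None
    return {
        "return_type": match.group("rtype"),
        "name": match.group("name"),
        "reference": int(match.group("ref")),
        "found_as": [FOUND_AS[flag] for flag in match.group("flags")],
        "file": match.group("file"),
        "line": int(match.group("line")),
    }


entry = parse_function_entry("int test() (1) <DMPCA> <TEST.C, 100>")
```

For the example line, 'entry' holds the return type 'int', the name 'test', the reference number 1 and the location TEST.C, line 100.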
CST OUTPUT
The output file is divided into several sections. Some of the
sections listed are generated by default (-), others are optional
(o) and only displayed if they are enabled by a command line
option. Also, the default sections can be customised to produce
the desired output. The sections generated for CST are (in the
order they appear):
- file header
- data structure calltree/called-by hierarchy listing (-r, -R,
-x, -a, -m, -f, -dn)
- data type summary
- multiple defined data types and their location (only if
detected)
o data type call statistics (-c[s])
o data type caller/member relations (-Z[s])
o data type call cross reference table (-z)
o maximum data type nesting (-n[a])
o source file - include file dependency (-M)
o data type tables for source files (-C[s], -s, -q)
- file information summary (-p, -q)
Each data type is displayed like:
struct _test (1) <BSUCE> <TEST.C, 90> <TEST.C, 60>
with the following meanings
- struct _test : type specifier
- (1) : reference number
- <BSUCE> : data type (one/none of):
B = basic type (void, char, int, ...),
S = struct,
U = union,
C = class,
E = enum
- <TEST.C, 90> : file name, line number of type definition
(only printed if necessary)
- <TEST.C, 60> : file name, line number of basic type
definition
The two locations for the data type can occur if the data type is
first defined and later assigned via 'typedef' or by '#define'
(if -P is not set) to another data type name:
test.c: ...
line 60: struct xyz {...};
...
line 90: typedef struct xyz struct _test;
...
Their definition is on different lines but both data type names
refer to the same data structure.
Like the convention used for functions, the line number is the
line where the structure, union, enumeration or class type
definition block starts with its initial '{' and not the line
where the type name resides.
For an example session and more detailed information about the
generated output of CFT and CST see the file EXAMPLE.DOC.
OUTPUT INTERPRETATION
Besides the hierarchical structure chart of the function and data
type relationships, the resulting output contains useful
information about the program which can be used for optimization,
reuse or maintenance purposes. Identifying the most frequently
called functions is a good way to find candidates for further
optimization. Low-level functions with many callers but no called
subfunctions are ideal for reuse. Functions with no callers may
be useless and can be discarded, provided that they are also not
called via function pointers. The chance of finding errors in
complex functions with many lines of source code, many called
functions and a lot of control statements is much higher than in
simple functions.
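These rules of thumb can be applied mechanically once the caller/member relations are available. The following Python sketch is one possible way to do this and is not part of the SXT package. The function name 'classify_functions', the threshold of two callers and the sample data are assumptions.

```python
from collections import defaultdict


def classify_functions(functions, calls, roots=("main",), min_callers=2):
    """Return (reuse candidates, possibly unused functions).

    functions lists all function names, calls holds (caller, callee)
    pairs, and roots names entry points that legitimately have no caller.
    """
    callers = defaultdict(set)
    callees = defaultdict(set)
    for caller, callee in calls:
        callers[callee].add(caller)
        callees[caller].add(callee)
    # many callers but no called subfunctions: candidates for reuse
    reuse = sorted(f for f in functions
                   if len(callers[f]) >= min_callers and not callees[f])
    # no callers at all: possibly unused (function pointers are not visible)
    unused = sorted(f for f in functions
                    if not callers[f] and f not in roots)
    return reuse, unused


# Hypothetical project data.
functions = ["main", "parse", "log", "helper", "old_dump"]
calls = [("main", "parse"), ("main", "log"),
         ("parse", "log"), ("parse", "helper")]
reuse, unused = classify_functions(functions, calls)
```

In this sample, 'log' is reported as a reuse candidate and 'old_dump' as possibly unused. A function in the second list still has to be checked by hand for calls via function pointers.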
10 INTEGRATION INTO PROGRAM DEVELOPMENT ENVIRONMENTS
Invoking CFT and CST directly from inside editors or integrated
programming environments (IDE) and displaying the results can be
a very useful feature during program development. With advanced
IDE's like that of Borland C++ or Microsoft PWB this is an easy
task.
The Borland IDE has in its system menu a section with 'transfer'
items. It contains programs that can be invoked from inside the
IDE like TASM or GREP. To add CFT and CST as new entries you have
to go to the OPTIONS menu and open 'TRANSFERS...'. Choose a free
entry in the table and select EDIT. A window will open with 3
edit lines. In the first line, called 'Program Title', you must
write 'C~FT' resp. 'C~ST' as the name being displayed in the
transfer section. The '~' precedes the hot-key characters 'F' and
'S'. In the second line, called 'Program Path', you must write
'CFTIDE' resp. 'CSTIDE', with the complete path if necessary.
'CFTIDE' and 'CSTIDE' are two batch files which perform the
invocation of CFT resp. CST together with the necessary options.
These batch files are part of the CXT package, you can change the
options defined there if you need other ones. In the third line,
called 'Command Line', you must write the macro commands '$EDNAME
$NOSWAP $CAP EDIT'. These macros transfer the file name in the
current edit window ($EDNAME) to the batch file, suppress window
swapping ($NOSWAP) and capture the processing results in a
separate edit window ($CAP EDIT). The last step is to save these
entries, then the integration is completed and CFT and CST can be
used as if they were built-in functions. The processing results
are shown in an edit window which can be scrolled, resized or
moved. By adding CFT and CST to the IDE it is much easier for the
programmer to use these tools.
11 TOOLS FOR DATABASE PROCESSING
To access the information stored in a database, the following
utilities are available for the SXT programs:
   CFTN   C Function Tree Navigator
   CSTN   C Structure Tree Navigator
   DFTN   DBASE Function Tree Navigator
   FFTN   FORTRAN Function Tree Navigator
   LFTN   LISP Function Tree Navigator
They can be used to recall the file name and line number of a
specific item (function or data type) from the database. If the
requested item is found in the database, it will be displayed
with its location where it is defined or where it is found for
the first time if there was no definition found during
processing.
As an additional feature, editors like BRIEF 3.0, QEDIT 2.1/3.0 or
MicroEMACS 3.11 can be invoked directly with the information
needed to open the target file and move the cursor to the line
where the searched item is located. For BRIEF there are several
macros available to perform searching inside the editor. If the
search was successful, a new edit window is opened with the file
positioned at the requested item. Both the DOS and the WINDOWS
versions of MicroEMACS are supported. Some of these actions are
also possible for QEDIT, with slight limitations due to its macro
programming capabilities.
Other user programmable editors which should be able to work with
CFTN and CSTN are EPSILON, ME, KEDIT, Codewright, Multi-Edit,
JED, GNU-EMACS ports like DEMACS or OEMACS, the Microsoft editor
M or integrated development environments like Borland IDE or
Microsoft PWB (this list may not be complete). You can try to
integrate CFTN and CSTN into these systems by using the BRIEF,
QEDIT or MicroEMACS macro files as examples for your own
integration development.
The version numbers for the editors mentioned in this manual
indicate those versions for which the described capabilities have
been tested.
PRECOMPILED SOURCE FILES
If the precompile option -P was used to process the C/C++ source
files related with the database, the results of searches may
sometimes seem to be wrong. This can happen if an identifier in
the source code is in fact defined as a macro and has been
replaced during preprocessing, so that the source processed by
the analyser differs from the original source. The cursor will
then point to an obviously wrong location or the search will
fail. An identifier which is in fact a macro name is unknown and
not accessible after precompiling. It is also possible that a
function used in the original source cannot be found in the
database. The reason is that the function is in fact a
'function-like' macro and was replaced during preprocessing. If
differently named macros have the same definition, a search for
an item may point to a location other than the requested one. If
the -P option is not set, the same item can have several 'alias'
names due to macro definitions. If the source code contains
explicit #line numbers, searching for a specific line may also
fail. Keep these exceptions in mind to interpret the results
correctly when using the database.
IMPORTANT NOTICE
Information recalled from the database may not be valid if
processed files were edited after the database was generated.
Possible errors are references to wrong files and/or lines if
source lines have been deleted or inserted, failed searches if
names have changed, or failed accesses to files which have been
renamed, moved or deleted. To guard against these errors, the
recall programs check the creation date/time and size of the file
for consistency. If inconsistencies are recognised, the user is
informed that the database is not up-to-date and should be
updated by processing the source files again.
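A check of this kind can be sketched as follows. This is only an
illustration of the idea, not the method used by the recall
programs: the manual does not describe the database format, so
the stored size and time stamp are simply passed in as
parameters, and the modification time reported by the operating
system is assumed to stand in for the file date/time.

```python
import os
import tempfile

def is_up_to_date(path, stored_size, stored_mtime):
    """Return True if the file still matches the recorded size and time stamp."""
    try:
        st = os.stat(path)
    except OSError:          # file was renamed, moved or deleted
        return False
    return st.st_size == stored_size and int(st.st_mtime) == int(stored_mtime)

# Demonstration with a temporary file standing in for a source file.
with tempfile.NamedTemporaryFile("w", suffix=".c", delete=False) as f:
    f.write("int main(void) { return 0; }\n")
    path = f.name

st = os.stat(path)
recorded = (st.st_size, st.st_mtime)      # what a database might have stored
unchanged = is_up_to_date(path, *recorded)

with open(path, "a") as f:                # simulate a later edit
    f.write("/* edited */\n")
changed = is_up_to_date(path, *recorded)

os.remove(path)                           # simulate a deleted file
missing = is_up_to_date(path, *recorded)
```

Comparing size and time stamp is cheap but cannot detect every
change; it only signals that the source files should be processed
again.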
SYNTAX: CFTN [options] pattern
CSTN [options] pattern
DFTN [options] pattern
FFTN [options] pattern
LFTN [options] pattern
OPTIONS
-Eeditor
Specifies the editor command line for option -e and overrides the
default and the environment values. See the section about
environment variables for further information about the required
format.
-F
Print all file names which are related with the database. This
option is useful to get a complete overview about all files of
the project.
-a
Print all function/data type names. Useful to generate a list of
items, for example as input to other programs.
-B
Same as -a, but prints additionally the internal database record
number. Used by BRIEF macros.
-bform
Run the search in batch mode: if the requested item is found, its
location is displayed on a single line as "file name line number"
(DEFAULT STYLE); if the search fails, there is no output. The
output style can be changed by specifying 'form', which overrides
the default style. As for option -E, you can specify the exact
places where the file name and line number should be inserted by
defining a format string with %s and %d (see also the section
about environment variables). For example, the formats to
generate a command line for invoking BRIEF, QEDIT or MicroEMACS
would look like
   cstn -b"b -m\"goto_line %d\" %s" ...   (BRIEF)
   cstn -b"q %s -n%d" ...                 (QEDIT)
   cstn -b"me -G%d %s" ...                (MicroEMACS)
This option gives you great flexibility in generating output for
your own purposes, for example to write a batch file or for
further use in other programs.
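How such a format string is filled in can be sketched like this.
The helper 'expand' is hypothetical and only mimics the described
behaviour (%s becomes the file name, %d the line number, in
whatever order they appear); it is not taken from the CFTN or
CSTN sources.

```python
import re

def expand(form, file_name, line):
    """Insert the file name (%s) and line number (%d) wherever they occur."""
    values = {"%s": file_name, "%d": str(line)}
    # A function is used as replacement so that backslashes in DOS paths
    # are inserted literally.
    return re.sub(r"%[sd]", lambda m: values[m.group(0)], form)

cmd_qedit = expand("q %s -n%d", "test.c", 60)
cmd_emacs = expand("me -G%d %s", "d:\\src\\test.c", 60)
cmd_brief = expand('b -m"goto_line %d" %s', "test.c", 60)
```

Because each placeholder is replaced by name rather than by
position, the same helper works for editors that expect the line
number before the file name.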
-e
If the requested item is found, an editor will be invoked to
display the file containing the requested item. There are three
different ways to specify the editor command line (evaluated in
that order):
1) use option -E,
2) define the environment variables CFTNEDIT, CSTNEDIT or
CXTNEDIT,
3) if nothing is specified, BRIEF as the default editor (if
present) will be invoked with the file name and line number
of the item to move the cursor to its location. Ensure that
the PATH environment variable is set correctly, including
the path for the BRIEF directory.
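The evaluation order above can be sketched as a small lookup
function. Two points are assumptions and not stated in this
manual: that the tool-specific variable (CFTNEDIT or CSTNEDIT) is
checked before the common CXTNEDIT, and that the built-in default
corresponds to the BRIEF format string shown in the section about
environment variables. The function name 'editor_format' is
hypothetical.

```python
import os

# Assumed default, taken from the BRIEF example in the manual.
DEFAULT_EDITOR = 'b -m"goto_line %d" %s'

def editor_format(option_E=None, tool="CFTN", env=os.environ):
    """Pick the editor format string: -E first, then environment, then BRIEF."""
    if option_E:
        return option_E
    # Assumption: the tool-specific variable wins over the common one.
    for name in (tool + "EDIT", "CXTNEDIT"):
        if env.get(name):
            return env[name]
    return DEFAULT_EDITOR

from_option = editor_format("vde %s", env={"CXTNEDIT": "q %s -n%d"})
from_common = editor_format(tool="CSTN", env={"CXTNEDIT": "q %s -n%d"})
from_specific = editor_format(
    tool="CSTN", env={"CSTNEDIT": "me -G%d %s", "CXTNEDIT": "q %s -n%d"})
from_default = editor_format(env={})
```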
-fname
Use 'name' as base name (path and file name) for database files.
It is also possible to use environment variables (CFTNBASE,
CSTNBASE, CXTNBASE) for the definition of the database names. If
-f and environment variables are not set, a DEFAULT NAME will be
used (see also option -G from CFT and CST syntax description).
This allows the use of different databases, for example,
generated for different projects. See also the section about
environment variables for further information.
-r#
This option prints the location for a selected item with matching
pattern and record number #. This option requires -b. Used by
BRIEF macros.
-Ritem
Print a cross reference list of every occurrence of 'item' with
complete file name and line number.
-Dfile
Print a list with the contents of 'file'.
-o[name]
Print output to file 'name'. If 'name' is not specified, DEFAULT
NAMES are used: CFTN.OUT or CSTN.OUT, respectively.
pattern
The item to search for in the database. This can be either a
function name (CFTN) or a data type name (CSTN). There are three
different ways of searching, depending on how 'pattern' is given:
   pattern    exact search
   pattern*   the beginning of the item must match pattern
   *pattern   a substring of the item must match pattern
If the item to search for consists of more than one word
(contains spaces), the search pattern must be quoted, like
"struct _iobuf", to ensure that the words are interpreted as a
single pattern.
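The three search modes can be illustrated with a short sketch.
The function 'matches' is hypothetical and assumes case-sensitive
comparison, which seems natural for C identifiers but is not
stated explicitly here.

```python
def matches(pattern, item):
    """Apply the exact, prefix ('pattern*') or substring ('*pattern') rule."""
    if pattern.startswith("*"):
        return pattern[1:] in item            # substring search
    if pattern.endswith("*"):
        return item.startswith(pattern[:-1])  # beginning must match
    return item == pattern                    # exact search

# Hypothetical database items, for illustration only.
items = ["struct _iobuf", "struct tm", "tmbuf", "union REGS"]
exact = [i for i in items if matches("tmbuf", i)]
prefix = [i for i in items if matches("struct*", i)]
substring = [i for i in items if matches("*buf", i)]
everything = [i for i in items if matches("*", i)]
```

Under this reading a lone '*' matches every item, which agrees
with the command line example 'CFTN *' that lists all functions.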
RETURN VALUES
The following values are returned to DOS or the calling program
to report the result of the database search:
- 100  searched item not found,
- 101  searched item found,
- 102  searched item found, but the source file may have been
       changed (creation date and/or file size are not equal)
       since the creation of the database (database is not
       up-to-date).
The returned value can be used to decide which action to take for
the different results, for example, if the database is not
up-to-date.
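A calling program might evaluate the return value as in the
following sketch. The three codes are those listed above; the
action names and the helper functions 'action_for' and 'lookup'
are hypothetical, and 'lookup' assumes that the navigator program
can be found via PATH.

```python
import subprocess

NOT_FOUND, FOUND, FOUND_STALE = 100, 101, 102   # values listed above

def action_for(code):
    """Map a navigator return value to a follow-up action."""
    if code == FOUND:
        return "open"        # location is valid, open the file
    if code == FOUND_STALE:
        return "rebuild"     # database should be regenerated first
    if code == NOT_FOUND:
        return "report"      # tell the user nothing was found
    return "error"           # any other value is unexpected

def lookup(tool, pattern):
    """Run a navigator in batch mode; return (action, printed location)."""
    result = subprocess.run([tool, "-b", pattern],
                            capture_output=True, text=True)
    return action_for(result.returncode), result.stdout.strip()
```

A call such as lookup("cftn", "main") would then return the
action together with the printed location, if any.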
ENVIRONMENT VARIABLES
CFTNEDIT, CSTNEDIT, CXTNEDIT:
The editor to invoke can be defined either by option -E or by
defining the environment variables CFTNEDIT (for CFTN), CSTNEDIT
(for CSTN) or the common variable CXTNEDIT (for both CFTN and
CSTN) with the format string of the editor of your choice. The
format string specifies the places where the file name and the
line number should be inserted to pass additional information to
the editor. Use %s for the file name and %d for the line number.
For example, the invocation of the default editor BRIEF could be
defined like
   SET CFTNEDIT=b -m"goto_line %d" %s
   SET CSTNEDIT=b -m"goto_line %d" %s
   SET CXTNEDIT=b -m"goto_line %d" %s
where 'b' is the BRIEF editor, '-m' specifies the macro to be
invoked when BRIEF starts, 'goto_line' is the macro name, '%d'
is the place to insert the line number and '%s' is the place for
the file name. Note that this example cannot be used on the
command line with the -E option because of the quotes. It is
possible to change the order of %d and %s if another editor is
used.
Here are additional configuration examples for other popular
editors (examples are given for CFTN, similar for CSTN):
   EDIT (MS-DOS 5.0):  SET CFTNEDIT=edit %s     or -E"edit %s" or
                       SET CFTNEDIT=edit        or -Eedit
   VDE 1.62:           SET CFTNEDIT=vde %s      or -E"vde %s" or
                       SET CFTNEDIT=vde         or -Evde
   QEDIT 2.1/3.0:      SET CFTNEDIT=q %s -n%d   or -E"q %s -n%d"
   MicroEMACS 3.11:    SET CFTNEDIT=me -G%d %s  or -E"me -G%d %s"
This notation allows users to customise CFTN and CSTN for their
preferred editor and to perform additional actions during
invocation. If your editor supports macro programming like BRIEF,
you are free to write your own macros that do things similar to
the CXT.CM macro file provided for BRIEF 3.0. I think this is the
most flexible way to give users control over this option and to
help them work with their preferred programming environment and
development tools.
CFTNBASE, CSTNBASE, CXTNBASE:
These environment variables can be used to specify the name of
the database. Similar to the editor environment variables,
CFTNBASE and CSTNBASE are related to CFTN and CSTN and CXTNBASE
is used for both. For example, to specify the database 'proj1'
located in directory 'd:\develop\projects' type
SET CFTNBASE=d:\develop\projects\proj1
SET CSTNBASE=d:\develop\projects\proj1
for a separate definition or
SET CXTNBASE=d:\develop\projects\proj1
for a common definition of the database name.
COMMAND LINE EXAMPLES
1) CFTN *
Displays all functions in lexicographical order with their return
types, file names and line numbers. Gives a short overview of all
functions found.
2) CSTN -e *
Edit all data types in lexicographical order, using the default
editor or the editor defined by the environment variable CSTNEDIT
or CXTNEDIT.
3) CFTN -fproject1 -Evde -e main
Search database named 'project1' for function 'main' and edit
with editor 'vde'.
4) CSTN -b "union REGS"
Search for data type 'union REGS' and display, if found, the file
name and line number.
5) CSTN -e -E"q %s -n%d" -fcft tmbuf
Search database 'cft' for data type 'tmbuf' and invoke, if found,
the editor 'q' (QEDIT 2.1/3.0) with the file name and line
number.
SEARCHING INSIDE BRIEF (Version 3.0)
This feature is one of the most powerful enhancements for the
BRIEF editor and offers the user full control over the complete
source code of software projects, no matter how big they are and
how many files they include. It extends the BRIEF editor into a
comfortable hypertext source code browser and locator system. The
browser allows the user to find and read important program
constructs like functions and data types in several files
simultaneously and to move between them. The complete project
with its several source and include files appears as if it were
a single unit. The browser helps programmers to learn about
existing program structures and supports them in developing new
code and maintaining existing code. The generated output files
CFT.LST or CST.LST (or the file created with the -o option) can
be used to walk along the hierarchy tree chart and to select
from there the function or data type that should be displayed in
detail.
The following features are implemented as macros:
- searching for a specific item, tagged or marked
- building menus of all defined items
- building menus of all references to a specific item
- building menus of all processed files
- building menus of all items defined in the current file
- searching for a specific item cross reference number
- changing the database name
Every function and data type can be accessed with just a
keystroke by moving the cursor onto it ("tagging") and executing
a macro to locate the item and zoom into the file where it is
defined. The user no longer has to remember the file names and
locations where the functions and data types are defined, nor
change files, directories and drives to access the files
manually.
It is possible to build interactive dialog menus with all
functions or data types in lexicographical order and to select an
item to display. This is very useful to get a quick overview of
all accessible functions and data types of the whole project. It
is also possible to build an interactive dialog menu with all
file names stored in the database, in lexicographical order, and
to select one file to open for editing. Other menus are available
for file contents lists and item cross references. All the
information needed to perform these actions is stored in the
databases generated by processing the files related with the
project.
To invoke CFTN and CSTN inside BRIEF, the macro file CXT.CM must
be loaded (with <F9> CXT.CM), which makes the implemented macros
available. These macros are
   MACRO NAME          KEY ASSIGNMENT (defined in CXTKEYS.CM)
   cft                 Shift F1
   cftmenu             Shift F2
   cftxrefmenu         Shift F3
   cftxrefmenuagain    Shift F4
   cftdefmenu          Shift F7
   cftfilemenu         Shift F8
   cftfind             Shift F11
   cftbase             Shift F12
   cst                 Ctrl F1
   cstmenu             Ctrl F2
   cstxrefmenu         Ctrl F3
   cstxrefmenuagain    Ctrl F4
   cstdefmenu          Ctrl F7
   cstfilemenu         Ctrl F8
   cstfind             Ctrl F11
   cstbase             Ctrl F12
   cxtbase             Alt Tab
   cxtsearchxref       Ctrl Tab
   cxthelp             <unassigned>
This macro key assignment list is also available within BRIEF as
a help screen which can be invoked by the macro 'cxthelp'. The
CXT help information is not part of the BRIEF help system because
this would need modifications of the original BRIEF help files.
Instead of loading the file CXT.CM and typing the macro names
manually, you can load the macro file CXTKEYS.CM, which loads the
CXT.CM file automatically when any of the macros listed above is
invoked with a hot-key. To simplify working with this package,
the CXTKEYS.CM macro file also contains key assignments for the
macros. These hot-keys offer a "point and shoot", hypertext-like
feeling. The macro source file CXTKEYS.CB contains the source
code for CXTKEYS.CM, so you can change the key assignments to
suit your personal needs or move the initialization function to
the BRIEF start-up macro file (for further information about
BRIEF macros see the BRIEF manuals). To load these macros and to
execute CFTN and CSTN, which are invoked from inside BRIEF, be
sure to set the directory path correctly. It is also necessary to
allow access to the macro file DIALOG.CM, which contains the
functions for building and processing dialog menus.
A search can be started by simply moving the cursor on the item
to search for or by marking a block with the item (necessary if
search pattern contains more than one word like 'struct xyz') and
then running one of the following macros (or press hot-keys):
<F10> cft (function search)
<F10> cst (data type search)
It is also possible to type the name of the item to search for
manually. To do this you must run one of the following macros:
<F10> cftfind <item> (function search)
<F10> cstfind <item> (data type search)
If the search was successful, a new window with the file
containing the item will be opened and the cursor will be placed
at the line where the item is located. If inconsistencies have
been detected, the user will be informed. If the requested item
or the source file containing the item is not found, a message
will be given. The macros for building the function and data type
dialog menu are
<F10> cftmenu (function menu)
<F10> cstmenu (data type menu)
You can scroll through the entries and select an item which
should be displayed. To access databases other than the default
ones, there are two ways to change the base names:
1) Set the environment variables CFTNBASE, CSTNBASE or CXTNBASE
(see description above). By loading the macro file CXT.CM
these variables will be used for initialization.
2) To change the base names from inside BRIEF, there are three
macros to do this. They overwrite the initial values given
by the environment variables:
<F10> cftbase change base name for function search
<F10> cstbase change base name for data type search
<F10> cxtbase change both CFT and CST base name
With these features it is possible to set default values for the
database files or to change between different databases without
leaving BRIEF which gives the user a maximum of flexibility. You
can display a menu list with all source files being scanned for
the database by typing
<F10> cftfilemenu (CFT file menu)
<F10> cstfilemenu (CST file menu)
With this feature you can get a quick overview of all files
related with the database. Other menu driven options concern the
display of all cross references to a specific item (see macro
'cst' for information about marking) with the macros
<F10> cftxrefmenu (CFT cross reference menu)
<F10> cftxrefmenuagain (show previous menu again)
<F10> cstxrefmenu (CST cross reference menu)
<F10> cstxrefmenuagain (show previous menu again)
and the display of a file contents list for the current source
file with the macros
<F10> cftdefmenu (CFT file contents menu)
<F10> cstdefmenu (CST file contents menu)
To search for the first appearance of a specific cross reference
number like '(123)' in a CFT or CST output listing file, move the
cursor to the reference number and type
<F10> cxtsearchxref (search cross reference)
The macro extracts the complete number and searches for its first
occurrence by starting from the beginning of the output file.
With this macro you can move quickly from any reference to its
initial description.
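The behaviour of this macro can be sketched outside BRIEF as
follows. The listing text in the example is invented for
illustration and does not reproduce the real CFT or CST output
layout; the function name and its parameters (1-based line
number, 0-based column of the cursor) are hypothetical as well.

```python
import re

def first_xref_line(listing, line_no, column):
    """Return the 1-based line of the first '(N)' equal to the one at the cursor."""
    lines = listing.splitlines()
    number = None
    # Extract the complete number of the reference under the cursor.
    for m in re.finditer(r"\((\d+)\)", lines[line_no - 1]):
        if m.start() <= column < m.end():
            number = m.group(1)
            break
    if number is None:
        return None                      # cursor is not on a reference
    target = "(" + number + ")"
    # Search from the beginning of the listing for the first occurrence.
    for i, text in enumerate(lines, start=1):
        if target in text:
            return i
    return None

# Invented listing: 'parse (2)' is described on line 2, referenced on line 4.
listing = "main (1)\n  parse (2)\n  report (3)\n    parse (2)\n"
found = first_xref_line(listing, 4, 11)
not_on_reference = first_xref_line(listing, 1, 0)
```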
All the macro functions described above are defined in the BRIEF
macro file CXT.CB. These macros make extensive use of the options
of CFTN and CSTN, which are described in detail earlier.
SEARCHING INSIDE QEDIT (2.1 and 3.0)
The popular shareware editor QEDIT with its macro programming
capabilities allows, like the BRIEF editor, the searching of
functions and data types from inside the editor. The following
examples for QEDIT macros act, with slight limitations, like the
BRIEF macros 'cft' and 'cst':
CFT function searching, assigned to <SHIFT F9>:
#f9 MacroBegin MarkWord Copy Dos 'cftn -b ' Paste '>tmp' Return
Return EditFile 'tmp' Return AltWordSet MarkWord Copy
DefaultWordSet EditFile Paste Return EditFile 'tmp' Return
EndLine CursorLeft MarkWord Copy Quit NextFile GotoLine
Paste Return
CST data type searching, assigned to <SHIFT F10>:
#f10 MacroBegin MarkWord Copy Dos 'cstn -b ' Paste '>tmp' Return
Return EditFile 'tmp' Return AltWordSet MarkWord Copy
DefaultWordSet EditFile Paste Return EditFile 'tmp' Return
EndLine CursorLeft MarkWord Copy Quit NextFile GotoLine
Paste Return
These QEDIT macro definitions can be placed into the
'qconfig.dat' configuration file and added to 'q.exe' with the
'qconfig.exe' configuration utility (for additional details about
QEDIT macro programming see the QEDIT documentation). The two
macros perform the following actions: mark the current word,
execute the CFTN or CSTN database search for the marked word via
DOS and redirect the output to file 'tmp', read the target file
name from 'tmp' and open the target file, read the line number
from 'tmp' and go to the selected line.
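The two read steps amount to splitting the single output line of
the -b option into file name and line number. A minimal sketch,
assuming the default style with the line number as the last
blank-separated field; the function 'parse_location' is
hypothetical:

```python
def parse_location(line):
    """Split one line of default-style -b output into (file name, line number)."""
    file_name, _, number = line.strip().rpartition(" ")
    file_name = file_name.strip()
    if not file_name or not number.isdigit():
        return None          # empty or malformed output: the search failed
    return file_name, int(number)

location = parse_location("d:\\src\\test.c 60\n")
failed = parse_location("")
malformed = parse_location("test.c")
```

Unlike the QEDIT macros, such a script can detect a failed
search, because batch mode prints nothing in that case.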
These macros work much like those used from BRIEF, but their
functionality is restricted by the limited capabilities of the
QEDIT macro programming language:
- there is no error check for a correct cursor location,
- the searched item must always be a single word like 'main'
or 'size_t', a combined pattern like 'struct iobuf' cannot
be searched,
- there is no error check if the search was successful or
failed or the database is not up-to-date,
- if the target file is the same as that from which the search
started and other additional files are also open (QEDIT ring
buffer), probably a wrong file will be accessed,
- the name of the database cannot be changed, the searches are
performed either with the default database or those defined
by the environment variables.
SEARCHING INSIDE MicroEMACS (Version 3.11, DOS & WINDOWS)
The most recent editor supported with macros for database access
is MicroEMACS 3.11. The macro file is named CXT_ME.CMD and should
be placed in the MicroEMACS directory. This macro file works with
both the DOS and the WINDOWS version of MicroEMACS 3.11.
The following macros are available:
- cft       function search for tagged item
- cst       data type search for tagged item
- cftmark   function search for marked item
- cstmark   data type search for marked item
- cftfind   function search for user defined item
- cstfind   data type search for user defined item
- cftfile   list of all CFT files
- cstfile   list of all CST files
- cftbase   set CFT database name
- cstbase   set CST database name
- cxtbase   set both CFT and CST database name
They can be invoked by loading the macro file CXT_ME.CMD with
ESC CTRL+S CXT_ME.CMD
and running the macro with
ESC CTRL+E <macro name>
If the macros are used with the MicroEMACS WINDOWS version, you
may have to change the DOSEXEC.PIF file, which is part of the
MicroEMACS 3.11 distribution package. During CXT macro execution,
the shell command may stop after it has finished and wait for the
<return> key to be pressed before continuing. To avoid this
interruption, edit the PIF file and select "Close window after
execution". The environment variables CFTNBASE, CSTNBASE and
CXTNBASE are used in the same way as in the BRIEF version. No
keys are assigned to the macro procedure names; if you prefer
hot-keys, you are free to assign them yourself.
In the MicroEMACS WINDOWS version, however, the user accessible
macros can be integrated into the "Miscellaneous" pull-down menu
(thanks to the incredible macro programming capabilities of
MicroEMACS!). To view the generated output file with its
semigraphic frames, change the font type and select for example
the 'TERMINAL' font from the OEM font list which supports
semigraphic characters.
12 TROUBLE SHOOTING
This section contains information about problems which may occur
during the use of the SXT programs, and their possible causes. It
is strongly recommended that users read the complete
documentation to get an overview of the features before they
start using CFT and CST and run into unexpected trouble. See also
the chapter about 'PROGRAM LIMITATIONS'.
A PROGRAM CANNOT BE EXECUTED
Possible causes: the program path is not specified in the
environment variable PATH, the programs are not yet installed in
the specified directory, or a 386 protected mode version was
started on an 80286 (or lower) computer.
EXECUTION STOPPED WITH MESSAGE "OUT OF MEMORY"
An attempt to allocate memory has failed. Try to remove
unnecessary memory resident TSR programs and/or use the protected
mode versions if you have a 386/486. If this message appears with
the protected mode versions, there is not enough free disk space
for the swap file. If possible, set the temporary directory,
defined by the 'TMP' or 'TEMP' environment variable, to another
drive.
WRITING THE OUTPUT FILE TAKES A LONG TIME
Possible causes: a large amount of information must be handled,
option -x or -r is not set so that the output tree chart is very
large, or the CPU and/or hard disk is slow. Use option -v to
redirect intermediate files to a faster RAM-disk (if one is
present).
THE RESULTING OUTPUT IS DEEPLY NESTED AND EXCEEDS THE SCREEN SIZE
Two possible reasons: the -r or -x option is not specified (use
it if so), or the source code/data types are indeed deeply
nested.
THE BRIEF MACROS CANNOT BE EXECUTED
The macro file is not loaded, other macros with the same names or
assigned keys already exist.
THE BRIEF OR MICROEMACS MACROS CANNOT BE LOADED
The path to the macro file location must be specified when
loading the macros, if they are not in the default directory for
the editor.
THE BRIEF MACROS DO NOT FIND ANY FUNCTIONS OR DATA TYPES
There is no access to CFTN, CSTN, DFTN (...), due to incorrect
path specification, no database is present, the path to the
database files is incorrect, the database name is incorrect.
THE BYTE OFFSET CALCULATION FILE "CST_OFFS.C" CANNOT BE COMPILED
Several reasons: necessary data types or include files are not
specified, or the CST processing was done with include files
other than those being used for compiling. If the amount of data
type information is too large, some compilers cannot compile the
large number of statements in the single file generated by CST
('out of heap space', 'code segment too large' or similar
messages). In that case you may have to split the file into
several smaller files or reduce the number of data types to
display.
LOCATING ITEMS IN THE BRIEF EDITOR POINTS TO WRONG PLACES
Searching for items from within the BRIEF editor points to wrong
lines, the requested item is not present there or the file seems
to be corrupted. This can have several reasons. The file may have
been changed since the database generation, so that the line
references are no longer valid. The source file may have explicit
#line numbers, as is usual for files produced by source code
generators like YACC/BISON or LEX/FLEX. Or the source file may
have been generated on a UNIX system and therefore have only LF
instead of CR+LF as end-of-line delimiter, so that BRIEF cannot
display the file correctly and the file seems to be written in a
single line.
UNEXPECTED RESULTS WHILE RUNNING UNDER WINDOWS 3.1
The 386 versions cannot run under Windows 3.1: they use the CPU
exclusively and therefore cannot co-exist with Windows. Only the
real mode versions can. In Windows enhanced mode (virtual 386
mode), the real mode versions cannot run simultaneously in
several independent DOS windows if they work in the same
directory or use the same temporary directory, because the
temporary intermediate files may have the same names and
conflict due to multiple accesses to the same file. This may also
happen if the same files are scanned.
MICROEMACS FOR WINDOWS SEEMS TO HANG DURING DATABASE ACCESS AND
DOES NOT RETURN
The reason is usually quite simple: The shell call to DOS through
DOSEXEC.PIF waits for a keystroke to continue execution and to
return to WINDOWS. You may change this behaviour by editing the
DOSEXEC.PIF file (see MicroEMACS section for further
information).
13 FREQUENTLY ASKED QUESTIONS
ARE THERE ANY RESTRICTIONS IN THE USE OF THE ANALYSIS RESULTS?
No restrictions for registered users! They can use the results
for all purposes like program documentation, customer information
or debugging. A notice about the name of the program is very
welcome.
WHY ARE THERE NO INTERACTIVE VERSIONS AVAILABLE?
Interactive, menu driven, SAA-like programs are user friendly,
but they take a lot of work to program and require much memory.
As the analysis tools need a great deal of memory, especially for
large software projects, I decided to satisfy the memory needs
first. The main work should go into the internal analysis methods
and not into the user interface layout. A future release may also
include MS-Windows (3.1, Win32s, NT) versions with an interactive
user interface. An advantage of the command line versions is that
they can be run from within an editor or a MAKE file.
WHY ARE SEVERAL DATABASE FILES FOR EVERY PROJECT GENERATED?
Separating the analysis items (identifier names, file names,
relationships, ...) of one project into several closely related
database files is the best way to achieve minimum storage
requirements and to optimise disk usage. This way of storage has
no redundancies compared to storage in a single database file.
WHY IS THERE NO CROSS REFERENCE FOR VARIABLES INCLUDED?
This would need much additional memory and slow down the analysis
process. There would also be many names with multiple definitions
in different contexts to manage if several files are analysed.
There are also many existing tools which perform this task quite
well.
WHY ARE CFT AND CST NOT COMBINED IN ONE PROGRAM?
Historical and practical reasons: the CFT development was started
before CST and both programs are optimised for their own special
purposes. Combining them would complicate them and slow down the
analysis process. Also the memory requirements would grow.
WHY DO THE NEW SXT PROGRAM PACKAGES DXT, FXT AND LXT NOT START
WITH VERSION 1.00?
Because they are directly derived from CXT. This means that they
share a lot of common source code with CFT and CST. Every
language independent feature is provided by all programs (see
options). It is therefore easier, for maintenance and release
purposes, to have a similar version number for all SXT programs.
This may change in future versions.
14 REFERENCES
Brian W. Kernighan, Dennis M. Ritchie: "The C Programming
Language", Prentice Hall, Englewood Cliffs, Second Edition 1988
Samuel P. Harbison, Guy L. Steele Jr.: "C: A Reference Manual",
Prentice Hall, Englewood Cliffs, Third Edition 1991
Bjarne Stroustrup: "The C++ Programming Language",
Addison-Wesley, Second Edition 1992
Margaret A. Ellis, Bjarne Stroustrup: "The Annotated C++
Reference Manual" (ARM), Addison-Wesley, Second Edition 1991
"Working Paper for Draft Proposed International Standard for
Information Systems - Programming Language C++", AT&T, ANSI
committee X3J16, ISO working group WG21, January 28, 1993
Bjarne Stroustrup, Keith Gorlen, Phil Brown, Dennis Mancl, Andrew
Koenig: "UNIX System V - AT&T C++ Language System, Release 2.1 -
Selected Readings", AT&T, 1989
Goldberg, A.: "Programmer as Reader", IEEE Software, September
1987
L.W. Cannon, R.A. Elliot, L.W. Kirchhoff, J.H. Miller, J.M.
Milner, R.W. Mitze, E.P. Schan, N.O. Whittington, H. Spencer, D.
Keppel, M. Brader: "Recommended C Style and Coding Standards",
Technical Report, in the Public Domain, Revision 6.0, July 1991
(revised and updated version of the 'AT&T Indian Hill style
guide', can be obtained via anonymous FTP from cs.washington.edu
in '~ftp/pub/cstyle.tar.Z')
A. Dolenc, A. Lemmke, D. Keppel, G.V. Reilly: "Notes on Writing
Portable Programs in C", Technical Report, in the Public Domain,
Revision 8, November 1990 (can be obtained via anonymous FTP from
cs.washington.edu in '~ftp/pub/cport.tar.Z')
M. Henricson, E. Nyquist: "Programming in C++, Rules and
Recommendations", Technical Report, in the Public Domain,
Ellemtel Telecommunication Systems Laboratories, Alvsjo/Sweden,
Document No. M 90 0118 Uen, Rev. C (can be obtained via anonymous
FTP from various sites as 'rules.ps.Z' or 'c++rules.ps.Z')
Compiler reference manuals and related documentation (language
references, language implementations and extensions):
- Microsoft C 5.1
- Microsoft C 6.0
- Microsoft C/C++ 7.0
- Microsoft C/C++ for Windows NT (Beta Release March 1993)
- Microsoft VC++ 1.0 for Windows NT (Beta Release June 1993)
- Microsoft C for SCO UNIX System V Rel. 3.2
- Microsoft Macro Assembler MASM 5.1
- Borland Turbo C++ 1.0
- Borland C++ 2.0
- Borland C++ 3.1
- Borland Turbo Assembler TASM 2.0
- Intel 80860 Metaware High C i860 APX (UNIX-hosted)
- Intel 80960 C-Compiler (ic960, ec960)
- Intel 80960 Assembler (asm960)
- GNU-960 Tools (UNIX-hosted)
- GNU-C Compiler 2.2.2 (C, C++, Objective-C)
- GNU Assembler
- AT&T C++ 2.1 CFRONT (C++ to C translator) for SCO UNIX
System V Rel. 3.2
- IBM C-Compilers (CC, XLC) for IBM RS 6000 RISC stations,
AIX 3.15
- HP C-Compilers (CC, C89) for HP Apollo 9000 RISC stations,
HP-UX 9.0
- VAX C
- 60 -
15 TRADEMARKS
All brand or product names are trademarks (TM) or registered
trademarks (R) of their respective owners.
The following products and names are Copyright (C) Juergen
Mueller (J.M.), all rights reserved world-wide:
CXT (TM) C EXPLORATION TOOLS
CFT (TM) C FUNCTION TREE GENERATOR
CFTN (TM) C FUNCTION TREE NAVIGATOR
CST (TM) C STRUCTURE TREE GENERATOR
CSTN (TM) C STRUCTURE TREE NAVIGATOR
DXT (TM) DBASE EXPLORATION TOOLS
DFT (TM) DBASE FUNCTION TREE GENERATOR
DFTN (TM) DBASE FUNCTION TREE NAVIGATOR
FXT (TM) FORTRAN EXPLORATION TOOLS
FFT (TM) FORTRAN FUNCTION TREE GENERATOR
FFTN (TM) FORTRAN FUNCTION TREE NAVIGATOR
LXT (TM) LISP EXPLORATION TOOLS
LFT (TM) LISP FUNCTION TREE GENERATOR
LFTN (TM) LISP FUNCTION TREE NAVIGATOR
The packages CXT, DXT, FXT and LXT are part of
SXT (TM) SOFTWARE EXPLORATION TOOLS
which provide a similar set of functionalities for the source
code analysis of different programming languages.
See PRODUCT.DOC for a complete overview of the SXT packages and
the different supported platforms.
- 61 -
APPENDIX 1: C-PRECOMPILER DEFINES
The following list shows the precompiler defines for the
supported compiler types (option -T). It contains the default
defines and the optional memory model and architecture defines.
Other defines which some of the compilers usually predefine are
not automatically defined by the -T option. These include defines
for the compilation target like WINDOWS, __WINDOWS__, _Windows,
DLL or __DLL__, for optimization like __OPTIMIZE__ or
__FASTCALL__, and for target (operating) systems like NT, MIPS,
UNIX, unix, __unix__, i386, __i386__, GNUDOS, BSD, VMS, USG, DGUX
or hpux. Other sometimes predefined macros are __STRICT_ANSI__ or
__CHAR_UNSIGNED__. If necessary, they can be defined by the user
on the command line with the -D option.
The macro name __cplusplus will be defined if the command line
option '-C++' is set to enable C++ processing.
1. MSC51 (Microsoft C 5.1):
Default defines: MSDOS, M_I86
C++ specific defines: (none)
Memory model defines: M_I86SM, M_I86MM, M_I86CM, M_I86LM,
M_I86HM
2. MSC70 (Microsoft C/C++ 7.0):
Default defines: MSDOS, M_I86, _MSC_VER (=700)
C++ specific defines: (none)
Memory model defines: M_I86TM, M_I86SM, M_I86MM, M_I86CM,
M_I86LM, M_I86HM
3. MSVCWNT (Microsoft VC++ 1.0 for Windows NT):
Default defines: MSDOS, M_I86, _MSC_VER (=800),
_M_IX86 (=300)
C++ specific defines: (none)
Memory model defines: (not necessary)
4. TC10 (Borland Turbo C++ 1.0):
Default defines: __MSDOS__, __TURBOC__
C++ specific defines: __TCPLUSPLUS__
Memory model defines: __TINY__, __SMALL__, __MEDIUM__,
__COMPACT__, __LARGE__, __HUGE__
5. BC20 (Borland C++ 2.0):
Default defines: __MSDOS__, __BORLANDC__ (=0x0200),
__TURBOC__ (=0x0297)
C++ specific defines: __BCPLUSPLUS__ (=0x0200),
__TCPLUSPLUS__ (=0x0200)
Memory model defines: __TINY__, __SMALL__, __MEDIUM__,
__COMPACT__, __LARGE__, __HUGE__
6. BC31 (Borland C++ 3.1):
Default defines: __MSDOS__, __BORLANDC__ (=0x0410),
__TURBOC__ (=0x0410)
C++ specific defines: __BCPLUSPLUS__ (=0x0310),
__TCPLUSPLUS__ (=0x0310)
Memory model defines: __TINY__, __SMALL__, __MEDIUM__,
__COMPACT__, __LARGE__, __HUGE__
7. BC10OS2 (Borland C++ 1.0 for OS/2):
Default defines: __OS2__, __BORLANDC__ (=0x0400),
__TURBOC__ (=0x0400)
C++ specific defines: __BCPLUSPLUS__ (=0x0320),
__TCPLUSPLUS__ (=0x0320),
__TEMPLATES__
Memory model defines: (not necessary)
8. GNU (GNU C 2.2.2):
Default defines: __GNUC__ (=2)
C++ specific defines: __GNUG__ (=2)
Memory model defines: (not necessary)
9. I960 (Intel iC960 3.0):
Default defines: __i960
C++ specific defines: (none)
Memory model defines: (not necessary)
Architecture defines: __i960KA, __i960KB, __i960SA, __i960SB,
__i960MC, __i960CA
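The defines listed above are the ones that source code typically
tests to select compiler-specific sections. When preprocessing is
enabled, the defines set by the -T option would therefore decide
which of these sections is analysed. The following fragment is
only an illustrative sketch: the function name compiler_family and
the returned labels are made up for this example and are not part
of CXT; only the macro names are taken from the list above.

```c
/* Hypothetical helper: returns a label for the compiler family,
   chosen from the predefined macros listed in this appendix. */
const char *compiler_family(void)
{
#if defined(__BORLANDC__)
    /* Borland C++ defines __BORLANDC__ and __TURBOC__,
       so __BORLANDC__ has to be tested first. */
    return "Borland C++";
#elif defined(__TURBOC__)
    return "Borland Turbo C++";
#elif defined(_MSC_VER)
    /* _MSC_VER is listed for MSC70 and MSVCWNT only. */
    return "Microsoft C/C++ 7.0 or later";
#elif defined(M_I86)
    return "Microsoft C 5.1 or similar";
#elif defined(__GNUC__)
    return "GNU C";
#elif defined(__i960)
    return "Intel iC960";
#else
    return "unknown";
#endif
}
```

Note that a compiler which is not in the list may still define
some of these macros (several compilers define __GNUC__ for
compatibility), so the result is a hint rather than a reliable
identification.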
- 63 -
APPENDIX 2: RESERVED C/C++ KEYWORDS
The following list shows the keywords recognised by CFT and CST:
the standard C keywords, the C++ keywords and the non-standard
keywords which are compiler-dependent extensions to the C or C++
language. Standard C keywords are always C++ keywords as well. The
C++ keywords are recognised only if option '-C++' is set,
otherwise they are treated as identifiers. This list may not be
complete or correct, because new releases of the supported
compilers may bring new extensions and the language standard
itself may be extended. C++, for which no 'real' language standard
exists so far (except the de-facto standard, the AT&T CFRONT
implementation), differs among implementations, especially for the
newly introduced exception and template concepts (try, catch,
throw, template). Undocumented but (obviously) present keywords,
especially in GNU C (e.g. __alignof, __classof, ...) or in
Microsoft C/C++ 7.0, are ignored (even if they are listed here).
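The treatment of C++ keywords as identifiers matters for existing
C sources, which may legally use names such as 'class', 'new' or
'this'. The following fragment is a small sketch of such code; it
is valid C but would not be valid C++, and all names in it are
invented for illustration only.

```c
#include <assert.h>

/* In C, 'class', 'new' and 'this' are ordinary identifiers. */
struct item {
    int class;   /* a category number, not a C++ class */
    int new;     /* non-zero if the item was added recently */
};

/* Counts the entries whose 'new' flag is set. */
int count_new(const struct item *this, int n)
{
    int i, total = 0;

    assert(this != 0 || n == 0);
    for (i = 0; i < n; i++) {
        if (this[i].new)
            total++;
    }
    return total;
}

/* Small demonstration with three items, two of them flagged. */
int demo_count(void)
{
    struct item items[3] = { {1, 1}, {2, 0}, {3, 1} };

    return count_new(items, 3);
}
```

Source code of this kind should be analysed without the '-C++'
option, since the names would otherwise be taken as keywords.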
KEYWORDS Standard compiler-specific extension
C C++ MSC TC/BC GNU C
7.0 3.0 2.2.2
asm x
auto x
break x
case x
catch x (x) x
cdecl x x
char x
class x
classof x
const x
continue x
default x
delete x
do x
double x
dynamic x
else x
enum x
except x
exception x
extern x
far x x
float x
for x
fortran x x
friend x
goto x
huge x x
if x
inline x
int x
interrupt x x
long x
near x x
new x
operator x
overload x x
pascal x x
private x
protected x
public x
register x
return x
short x
signed x
sizeof x
static x
struct x
switch x
template x
this x
throw x
try x (x) x
typedef x
typeof x
union x
unsigned x
virtual x
void x
volatile x
while x
__alignof x
__alignof__ x
__asm x x
__asm__ x
__attribute x
__attribute__ x
__based x
__cdecl x
__classof x
__classof__ x
__const x x
__const__ x
__emit x
__except x
__export x
__extension__ x
__far x
__fastcall x
__finally x
__fortran x
__headof x
__headof__ x
__huge x
__inline x
__inline__ x
__interrupt x
__label__ x
__loadds x
__near x
__saveregs x
__segment x
__segname x
__self x
__signed x
__signed__ x
__stdcall x
__syscall x
__try x
__typeof x
__typeof__ x
__volatile x
__volatile__ x
_asm x
_based x
_cdecl x
_emit x
_export x x
_far x
_fastcall x
_fortran x
_huge x
_interrupt x
_loadds x x
_near x
_pascal x
_saveregs x x
_seg x
_segment x
_segname x
_self x
- 66 -
APPENDIX 3: EFFICIENCY
To provide some values about the speed and the efficiency of the
programs, tests were performed with CFT386 and CST386 (version
2.12), running on a 33 MHz 80486 with 8 MB RAM, 256 KB cache and
a 15 ms hard disk (no disk cache or RAM-disk installed).
The source code for the first test was the C++ part of the GNU-C
compiler (version 2.2.2), which is the largest of the three
compiler parts (C, C++, Objective-C). The following results have
been found:
- 139 files (71 source files and 68 include files) have been
scanned
- a total number of 2330 functions has been found, of which
2248 functions were defined in the 71 source files
- the directed call graph would have 2314 nodes and 10301
connections
- the critical function call path has a maximum nesting level
of 115
- the total size of the 139 files is 6.532 MB with 208600
lines (about 31 bytes/line), source code/filesize ratio
0.739, average function size is 1951 bytes resp. 63 lines
- the effective size of the preprocessed and scanned source
code (source files and their included files) is 20.775 MB
with 596500 lines
- the resulting output file (options -m -rauspP -TGNU -cs -Cs
-n) has about 3.94 MB and 36100 lines
- the resulting 6 database files have a size of 727 KB (source
code/database ratio is about 9 : 1)
- inside BRIEF, a database search for the location of a
function is performed in less than 4 seconds
- the total time for the complete processing was 31'03''
minutes with 26'30'' for analysis (includes 18'15'' for
preprocessing), 2'50'' for output file writing and 1'43''
for database writing
- the average analysis speed for this source code was about
783 KB/min. respectively 22510 lines/min. (The values only
for source scanning without preprocessing are: 2.51 MB/min.
resp. 72300 lines/min.)
The CFT386 results for a large commercial project are:
- 190 files (132 source files (C and assembler) and 58 include
files) have been scanned
- a total number of 1223 functions has been found, of which
1177 functions were defined in the 132 source and in 3
include files (some include files contain inline functions)
- the directed call graph would have 1223 nodes and 2366
connections
- the total size of the 190 files is 6.22 MB with 145550 lines
(about 42 bytes/line), source code/filesize ratio 0.533,
average function size is 1805 bytes resp. 66 lines
- the effective size of the preprocessed and scanned source
code (source files and their included files) is 48.42 MB
with 959100 lines
- the resulting output file (options -m -rauspP -cs -Cs -na)
has about 907 KB and 24700 lines
- the resulting 6 database files have a size of 306 KB (source
code/database ratio is about 20 : 1)
- the total time for the complete processing was 35'25''
minutes with 34'15'' for analysis, 0'45'' for output file
writing and 0'25'' for database writing
- the average analysis speed for this source code was about
1.41 MB/min. respectively 28000 lines/min.
To get some efficiency values for CST386, the include files from
another commercial project were analysed for data types:
- 52 include files have been scanned
- a total number of 605 data types has been found, of which
567 structures/unions were defined in 42 of the 52 include
files
- the directed call graph would have 588 nodes and 1787
connections
- the total size of the 52 files is 1.384 MB with 25410 lines
(about 54 bytes/line), source code/filesize ratio 0.343
- the resulting output file (options -rasp -cs -Cs -n) has
about 378 KB and 8740 lines
- the resulting 6 database files have a size of 312 KB (source
code/database ratio is about 4.4 : 1)
- the total time for the complete processing was 1'10''
minutes with 0'25'' for analysis, 0'16'' for output file
writing and 0'29'' for database writing
- the average analysis (scanning) speed for this source code
was about 3.32 MB/min. respectively 60980 lines/min (note:
NO preprocessing performed, only scanning!).
The calculated average values for the analysis speed differ due
to the effective size of the 'really' present source code in
relation to the size of the comments which can be seen by the
code/filesize ratio. The speed values do not consider that, if
the preprocessing option -P is set, the source code is first
preprocessed to a temporary file and then analysed in a second
step so that large parts of the source code are read twice
(original and preprocessed code) and written once (intermediate
preprocessor output).
With these facts in mind, the analysis speed of CFT and CST seems
to be quite acceptable!
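The average speed values quoted in this appendix appear to be the
effective size (or line count) divided by the analysis time. The
following fragment is a minimal sketch of that calculation; the
function name rate_per_minute is an assumption made for this
example and is not part of CFT or CST.

```c
#include <assert.h>

/* Average rate = amount processed / elapsed time in minutes.
   'amount' may be a size in MB or a number of lines. */
double rate_per_minute(double amount, int minutes, int seconds)
{
    double elapsed = minutes + seconds / 60.0;

    assert(elapsed > 0.0);
    return amount / elapsed;
}
```

For the first test, rate_per_minute(20.775, 26, 30) gives about
0.78 MB/min. and rate_per_minute(596500.0, 26, 30) gives about
22500 lines/min. Subtracting the 18'15'' of preprocessing leaves
8'15'' for source scanning, for which rate_per_minute(20.775, 8,
15) gives about 2.5 MB/min.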
- 68 -
APPENDIX 4: SYSTEM REQUIREMENTS
DOS real mode versions:
- IBM-AT or 100% compatible with Intel 80286 or higher, 512 KB
RAM, hard disk, DOS 3.3 or higher
DOS protected mode versions:
- IBM-AT or 100% compatible with Intel 80386+80387 or higher,
2 MB RAM, hard disk, DOS 3.3 or higher
APPENDIX 5: INSTALLATION
See INSTALL.DOC for installation information.
(THIS DOCUMENT HAS 69 PAGES)
- 69 -